Syntactic nitpicking is bikeshedding. Remember, Guido makes choices that may not be obvious unless you're Dutch, but he has a pretty good track record overall.
A better critique would focus on whether the language actually needed an Enum construct. Most people get by most of the time without it. Python already provides many ways to do it (module variables, class variables, etc).
The main problem with Enum is that you won't be able to ignore it. Enums will start popping up in many modules and packages, so it will have to become part of your core Python knowledge (things you have to teach to beginners so they can work with existing code).
Contrast that with named tuples. They can be ignored (i.e. you can treat them like regular tuples and you'll get by just fine). Also, named tuples are profoundly more useful (i.e. using them is one of the easiest ways to improve the clarity of your code).
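A minimal sketch of the "can be ignored" point, using a hypothetical Point type that is not from the discussion: a named tuple still indexes, unpacks, and compares like a plain tuple, and the field names are there only if you want them.

```python
from collections import namedtuple

# Hypothetical example type, chosen only for illustration.
Point = namedtuple('Point', ['x', 'y'])

p = Point(3, 4)

# Code that knows nothing about named tuples can treat p as a plain tuple.
x, y = p
first_item = p[0]
same_as_tuple = (p == (3, 4))

# Code that does know about them can use the field names for clarity.
distance_squared = p.x ** 2 + p.y ** 2
```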
And yet most large Python projects reinvent the concept of enums in one way or another. This includes Python itself, where enums could be used in many places in the stdlib. Actually, one of the main reasons IntEnum is part of the implementation is that it will be possible to replace stdlib constants with it (for example some socket.* and io.* constants).
This was discussed in the mailing list in the past, and I even had a prototype of CPython with constants replaced by IntEnum. Due to its being an actual int it retains backwards compatibility while providing very nice printable representation for the constants.
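A small sketch of what that backwards compatibility looks like, assuming the IntEnum class from the accepted enum module. The Whence class below is a hypothetical stand-in for a group of stdlib constants, not the actual stdlib definition.

```python
from enum import IntEnum

# Hypothetical stand-in for a group of module-level integer constants.
class Whence(IntEnum):
    SEEK_SET = 0
    SEEK_CUR = 1
    SEEK_END = 2

# Members are real ints, so existing comparisons and arithmetic keep working.
is_int = isinstance(Whence.SEEK_END, int)
equals_plain_int = (Whence.SEEK_END == 2)
total = Whence.SEEK_CUR + 1

# The repr names the constant instead of showing a bare number.
shown = repr(Whence.SEEK_END)
```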
Clearly, you're a big proponent of Enums needing to be in Python (I would estimate that you've written several hundred emails, comments, and posts on the subject).
You and a couple other vociferous proponents completely drowned out the early commenters who valued language compactness and learnability over a pervasive new construct that solves a somewhat unimportant problem.
The fact that some large projects may have a use for Enums wasn't balanced against an impending avalanche of Enums being sprinkled in beginner code, small projects, recipes, etc. I expect the use of Enums will become pervasive simply because they're there, not because a given snippet of code actually needs it.
As a person who teaches Python to engineers, I dread adding yet another have-to-know construct to the curriculum. Python is no longer a small language.
Also, there is a lot of machinery behind Enums (killing a mosquito with a cannon). So, we should expect that people will hack it, abuse it, twist it into knots, subclass it, invent tricks with it, develop funky idioms, etc. IMO, none of this is worth it. The "problem" wasn't that important to begin with.
That said, the discussion is moot. There is no need to defend the proposal any more. Guido has accepted the PEP and it will enter the language, for better or for worse.
If we had to have an Enum, the one that was accepted was a reasonable choice. For the most part, the discussion did a nice job of considering existing recipes, possible use cases, and being as Pythonic as possible.
My only criticism of the process is that a handful of enthusiasts (Eli, Ethan, Barry, etc) launched an avalanche of implementation and api-choice emails that buried and ignored the posts suggesting the language would be better-off without Enum. AFAICT, there was no honest discussion of simply establishing best practices using existing tools.
That's not an entirely fair way to put it, Raymond. Yes, some folks said that maybe Python doesn't need enums at all. But many more said Enums should get into Python, including Guido. The issue was sealed in a face-to-face discussion during the recent PyCon language summit in which many core devs participated. So presenting this as some evil ploy to "drown the opposition" in a flood of email is... unfair.
Naturally I participated in the discussion, but the vast majority of the emails I sent were meant to steer it, long after the decision to add some kind of enumeration was made.
It depends. In our code we have integer constants used as kind markers that get stored in the database. Messing them up would break things badly. And we have them everywhere. And we have maybe over a thousand of them.
Had we had this enum from the beginning we would now probably have a code base in a slightly but significantly better shape.
If you can't distinguish between enumerations and named tuples, I think you have a bigger problem than you think. I mean this in full seriousness. Conceptually, the two constructs have completely different uses. That you can twist and bend one to implement a faint shadow of the other doesn't mean they are the same.
>Conceptually, the two constructs have completely different uses.
I'm not the one you replied to, and it's tough to not sound sarcastic on the internet, but I'm honestly curious what the purpose of an Enum type is. I didn't really get their purpose when I took Java I, and I don't really see why Python needs them.
What is the advantage of declaring an Enum vs something like,
Because if you read the function signature of a function like def fill_rectangle(color) you haven't got the slightest idea what the valid values of "color" are. (Ignore the fact that python doesn't have types in the function signature to start with, say you're reading the documentation instead)
If you knew that color was of the enum-type my_gui_lib.standard_colors you know exactly which colors you can use and you can reuse them for fill_ellipse also because it most likely uses the same type. Instead of every function having to reiterate all valid integer values of color or referring to some table. Your editor will also provide you with auto complete of valid values unless you're coding in notepad.
In a type-safe language you will even get a compilation error if you provide a type/value out of range, so it's simply impossible to pass the function an invalid value, which is invaluable. Since python is dynamic this would be hard, so in python the value added isn't as strong as in a strongly typed language, but it's still there.
To expand on what the other poster said: VALUE_ONE, VALUE_TWO and VALUE_THREE are ints. That is their type. In most cases, though, groups of constants have some conceptual type that is entirely separate from integers. It's nice to formalize that concept in our code with an actual type. That way, when we say a function accepts that type, we can't also pass it a random integer. Or if it returns that type, we can't return a random integer.
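A rough sketch of that idea with the accepted Enum class. The Color class and the body of fill_rectangle are assumptions made up for illustration. Python will not enforce the type by itself, so the check below is written by hand.

```python
from enum import Enum

# Hypothetical colour type for a hypothetical GUI library.
class Color(Enum):
    red = 1
    green = 2

def fill_rectangle(color):
    # Reject anything that is not a Color member, including a bare int.
    if not isinstance(color, Color):
        raise TypeError('color must be a Color, got %r' % (color,))
    return 'filled with ' + color.name

ok = fill_rectangle(Color.red)

# A plain Enum member does not compare equal to its underlying integer.
equal_to_int = (Color.red == 1)

try:
    fill_rectangle(1)
    rejected = False
except TypeError:
    rejected = True
```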
There's no enforced contract for a particular type, but there is a way to create entirely new values that have no relation to any existing value. I.e. you can compare a value to MyEnum.SpecialUndefined and be sure that neither None nor 0 will yield True. OTOH, if someone were determined to create a thing with the same conceptual meaning as SpecialUndefined, they could explicitly define it as equal to MyEnum.SpecialUndefined.
In C/C++, when you use an enum type in a switch, the compiler will warn you if you are missing one. If you pass an enum type as an argument, there will be a compile-time check that the constant is part of the enum (you can force them, but that is not the point). Having a grouped set of constants also makes sense from an organizational POV. In Java, enums are classes while in C/C++ they are only a set of constants. Enums in different languages are implemented differently and have various kinds of extra features. But deep down, they always serve the purpose of storing a related set of constants.
If Enum is a class, making a subclass should make a subclass like normal. Not change how code is parsed within the body of the class definition!
And the same applies to subsequent imaginary suggestions.
If you want a syntax change then introduce a new 'enum' keyword to signal that we are dealing with a new syntax.
Similarly, if you want the statement 'red = 1' to result in Color.red == 'red', that is magic. It doesn't make sense. It should give you 1. The value you just assigned it.
I agree that passing a string to make a class (in the way of namedtuple) is ugly and undesirable and I would never use it.
But I don't think Mr. Cooke has made any suggestion that is better than "design by committee". Code which looks like Python, in a Python file, should behave like Python rather than throwing bizarre curveballs like implicitly turning 1 into "red" or a bitfield.
I don't agree with his suggestion, but it wouldn't require a change to the parser. You can already pull that off using metaclasses. Something like this (untested):
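class EnumBodyDict(dict):
    """Class-body namespace that records bare names used before assignment."""
    def __init__(self, *a, **kw):
        self._keys_accessed = []
        dict.__init__(self, *a, **kw)

    def __getitem__(self, key):
        # A bare name in the class body triggers a lookup here. Record it
        # and hand back a placeholder instead of raising KeyError; dunder
        # names such as __name__ must still fall through to globals.
        if key not in self and not key.startswith('__'):
            self._keys_accessed.append(key)
            return None
        return dict.__getitem__(self, key)

class EnumMeta(type):
    @classmethod
    def __prepare__(metacls, name, bases, **kwargs):
        return EnumBodyDict()

    def __new__(cls, name, bases, classdict):
        # Continue counting after the largest explicitly assigned integer.
        explicit = [v for k, v in classdict.items()
                    if isinstance(v, int) and not k.startswith('__')]
        next_enum_value = max(explicit, default=0) + 1
        for key in classdict._keys_accessed:
            if key not in classdict:
                classdict[key] = next_enum_value
                next_enum_value += 1
        return type.__new__(cls, name, bases, classdict)

class Enum(object, metaclass=EnumMeta):
    # proposed enum implementation here
    pass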
You're right, this can be done without changing the Python syntax and in fact there was a proposal (with implementation) on the table for a long time. But eventually it was deemed to be too error prone and the more explicit approach was chosen. Python often resolves tradeoffs in favor of explicitness, and IMHO this ended up being a good decision.
Every point raised by this post was discussed and rehashed multiple times, debated, argued over, complete with emotions. The end result was something that feels right to the vast majority of the discussion participants (give or take a few minor details.) Anyone has the right to criticize, and anyone also has the right to participate in the design and decisions - this is open source. Sadly way more people choose to take advantage of the former freedom than of the latter. But that's old news, just the way things are.
Every language/framework/system has features some people won't like. Python has many too. But Python did a great job at striking a careful balance between expressibility and syntax, and most programmers who have to deal with it love it. Enum's design was guided by the same principles. Given that we didn't want to add new syntax to the language just for this feature (Python's minimal syntax is one of its greatest strengths), we had a few challenges to face. We tried to follow Python's overall guide of explicitness, while allowing numerous features.
The end goal? To replace the plethora of hand-cooked enums almost every large Python project has (including examples within Python itself).
Finally, remember that many issues here are issues of style and personal preference. It's not far from a brace style war really.
One minor correction about named tuples. The syntax for field names can be an iterable or a string. The preferred way to do it is:
> Person = namedtuple('Person', ['name', 'age', 'gender'])
The string form is there to accommodate use cases where the field names are being cut-and-pasted from SQL or CSV headers or somesuch. Also, some people find it easier to type:
> Person = namedtuple('Person', 'name, age, gender')
All the same ways were chosen for the Enum functional API, by the way (list, dict, string separated with either spaces or commas). In general, the functional API was inspired mostly by namedtuple. It's not that the namedtuple syntax is pretty. It's not, we all know that, but that's what Python can offer at this point. But it's good enough so people learn to live with it, get used to it and intuitively understand what it means, and that's good enough.
That said, the preferred and recommended approach to define enums is with the class syntax.
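For reference, a short sketch of the functional API forms mentioned above next to the class syntax, assuming the enum module as accepted. The numbered Colour names exist only to keep the variants apart.

```python
from enum import Enum

# Names as a comma-separated string, a space-separated string, or a list.
Colour1 = Enum('Colour', 'red, green')
Colour2 = Enum('Colour', 'red green')
Colour3 = Enum('Colour', ['red', 'green'])

# A dict supplies explicit values instead of the automatic ones.
Colour4 = Enum('Colour', {'red': 10, 'green': 20})

# The recommended class syntax.
class Colour(Enum):
    red = 1
    green = 2

auto_values = [member.value for member in Colour1]
list_names = [member.name for member in Colour3]
```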
"only real achievement is not doing anything new". That sounds like an excellent endorsement for a new Python language feature. Leave the crazy syntax innovation for hipster languages with unreadable, poorly thought out features. I've got work to do in Python.
Even as someone who hasn't done any real work in such languages, your statement strikes me as viscerally true. Those features are an obvious boon for reliability.
this creates a type that acts exactly like an enum. With the added benefit of having a well-defined string representation and maxBound/minBound constants for free.
It's sort of an accident: the inventor of ML (which SML, OCaml, and F# are descended from, and Haskell has liberally borrowed from) is the "Milner" in Hindley-Milner type checking, and algebraic types are one of the things that make Hindley-Milner useful.
Anyone who's used the object-oriented part of OCaml can tell you that inheritance can mix with Hindley-Milner awkwardly... but I'm not sure that really means algebraic types and pattern matching couldn't be integrated into Java, Python, or whatever somehow. For all I know, maybe it's already on the way into C++...
There is a reason why pattern matching syntactic sugar is not in python. Pattern matching is very powerful, but it is also ambiguous. And it is impossible to come up with good looking syntax, unfortunately.
Ambiguous, because in the:
# y = 5
#
match x with:
    1: "my string"
    2|3: print "2,3"
    y: print y
there is a problem. Because, really, in case of 'y' you want to be able to do both: bind variables; use already defined variables. And there is no way you can do both with clean and concise syntax. On the other hand, there is already a way to achieve the same results with if,elif,else. That's why (sadly) pattern matching is not there.
What I was saying is that if you are to complain about the lack of pattern matching syntax in Python, you want to consider that such syntax requires concise and unambiguous handling of a number of cases:
matching a value of variable:
|z when z = y -> print y
free matching with a bind:
|y -> print y
matching several values:
|2|3 -> print "2 or 3"
matching tuples:
|2,3 -> print "tuple 2,3"
Now, if I'm trying to come up with such syntax for Python, I see immediate problems. For example, let's consider the following syntax for matching several values:
match x with:
    1: print "my string"
    2,3: print "2,3"
Would it be equivalent to matching a tuple (2,3) or matching one of the values 2 or 3? Ambiguous. So you need some other syntax. Let's try a few variations:
match x with:
    1: print "my string"
    2:3: print "2 or 3" # plain ugly
    2|3: print "2 or 3" # interferes with '|' op
    in (2,3): print "2 or 3" # too complicated
    x in (2,3): print "2 or 3" # elif was simpler
Or consider:
match x with:
    1: print "my string"
    y: print y
Would that be matching a value of variable (|z when z = y -> print y), or free matching with a bind (|y -> print y)?
Having experimented with that a bit, I have a feeling, that it's just impossible to make up 'match' syntax for Python that would fit into the language. And I think that's why it is not there. And why we don't even want match syntax there. Now, of course I'd be happy to change my opinion, if somebody would rise up to the challenge and come up with some syntax that fits.
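For comparison, here is a sketch of how the four cases listed above come out with plain if/elif. The describe function and its default y=5 are hypothetical. Each case is spelled differently, so the bind-versus-compare ambiguity does not arise.

```python
def describe(x, y=5):
    """Hypothetical helper: dispatch on x with plain if/elif."""
    if x == 1:                  # match a literal
        return "my string"
    elif x in (2, 3):           # match one of several values
        return "2 or 3"
    elif x == (2, 3):           # match a tuple
        return "tuple 2,3"
    elif x == y:                # compare against an existing variable
        return "equal to y: %r" % (y,)
    else:                       # catch-all; x is already bound to the value
        return "something else: %r" % (x,)
```

Calling describe(2) takes the "several values" branch, while describe((2, 3)) falls through to the tuple branch.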
Every language solves that with a disambiguation rule. In Haskell, the first matching line runs. It's no worse than the way C, Java, and family solve the if-else ambiguity, for example.
Could you elaborate on this? I would agree with "Haskell is not an ML" but I think the MLs (SML, OCaml, Mythryl, F#) are close to Haskell, and closer to Haskell than to other programming languages, certainly closer than to Scala. Haskell is lazy, pure, and has typeclasses, while MLs are strict, impure, and (sometimes, usually) have modules, ... while there might be a lot of other differences, there are a lot of similarities too, between algebraic datatypes, Hindley-Milner, pattern matching, etc...
ML (Ocaml, I should say, because that's my experience) is like a functional C. As a production language, I'd recommend it highly. It has excellent garbage collection. It generates blazingly fast native code.
I can't speak either way about Haskell's use in production. I'm sure that many people have made it work. I just don't know how hard it is in practice.
Haskell is lazy, for one, which makes it hard to reason about performance. You can have production memory leaks that are a bitch to debug. It also has a much more powerful, but also more complex type system. Explicit functors (Ocaml) are replaced by implicit type classes, which make the language more attractive (I'll grant that) but also make it easier to hang yourself by the monads.
Haskell's still a great language, and there are a million things that recommend it. However, I don't see it as occupying the same space as the ML family. The main similarity is that both use Hindley-Milner type inference, but there are a lot of differences, too.
Finally, Haskell is pure (except in the IO Monad, and a couple others) while Ocaml's not. You have stateful arrays and ref cells in Ocaml, and use them all the time when you're writing high-performance production code. In Haskell, any IO is to be done "in the IO monad" (which means, "in the context of evaluating an IO a", the latter being a thunk that does IO and returns an a.)
It also has a much more powerful, but also more complex type system.
To be precise, it's type _inference_ of Damas-Hindley-Milner + Typeclasses that poses the most gotchas. (Plain vanilla DHM is just ML. You can hack Haskell without class, but you can't get the class out of Haskell.)
Experience with Prolog helps, since a form of logical deduction, generally speaking, is what's happening behind the scenes. You could think of the compiler as applying AI to figure out the types. This kind of AI is magic when it works and baffling when it barfs.
One way out is to skip the AI entirely and annotate all the types by hand. Now the compiler only has to check types, not infer them. When it stumbles over imperfect code, the error messages become a lot more decipherable.
A shallow nitpicking of language syntax. I would rather read about how a programming language fares at solving difficult problems. Not a minor quibble that has no impact on anything other than this programmer's sensibilities. It's the kind of thing that makes me think people like this are a type, a type that wants to abandon semicolons and jump on languages that look better but perform worse.
Semicolons are a triple cognitive overhead (errors from omitting them, irritation that they add no value in well-formatted code, and ability to write very poorly formatted code), where their absence has the only overhead of unnatural ways of breaking long lines, so typical programmer performance should be better.
I actually disagree with basically all of the points raised in this article. Sure, apparently Enum doesn't fit your particular desires for it - but there are good reasons why those choices were made.
Colour = Enum('Colour', 'red, green')
There's a difference here between 'name' as in where in the namespace a value is listed, and 'name' as in what descriptive name something has. For instance, if you were to do this...
Would you want the values from Tint to suddenly say that they're different values from the values in Colour, even though it's the same underlying object? I doubt it.
Similarly, regarding having to assign numbers: this makes a hell of a lot of sense any time you're trying to serialize things for interoperation between multiple services. There's a reason why protobufs do it this way as well: you get really powerful backwards-compatibility options, which you might not realize you need at first but become extremely handy down the line.
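A minimal sketch of why explicit numbers help with serialization, assuming the accepted Enum class. The Status type and its values are hypothetical. The stored integer stays stable even if members are later renamed or reordered.

```python
from enum import Enum

# Hypothetical status type whose numeric values are part of a stored format.
class Status(Enum):
    active = 1
    suspended = 2
    deleted = 3

# Serialize: only the explicit number leaves the process.
stored = Status.suspended.value

# Deserialize: look the member up again by its value.
restored = Status(stored)
```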
On Wed, Feb 27, 2013 at 2:03 AM, Ethan Furman <ethan@stoneleaf.us> wrote:
> I'm beginning to see why enums as a class has not yet been added to Python.
> We don't want to complicate the language with too many choices, yet there is
> no One Obvious Enum to fit the wide variety of use-cases:
>
> - named int enums (http status codes)
> - named str enums (tkinter options)
> - named bitmask enums (file-type options)
> - named valueless enums (any random set of names)
> - named valueless-yet-orderable enums (any not-so-random set of names)
That's probably the best succinct description of the core problem I've
seen, and I've been following the various enum-related discussions for
years
Cheers,
Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
Mildly unrelated, but when I think of the opposite ("imaginative" enums), the first thing that comes to mind is this amusing vignette on the design process of Java's enums: http://blog.des.no/2009/01/no-silly-its-enum/
How embarrassingly narrow-minded. Yeah, Java's enums are fully fledged classes. Which actually makes them extremely powerful and useful, more useful than enums in any other language. I've seen more than one C programmer unable to cope with such an extreme deviation from their bunch-of-integer-constants expectation.
The essential property (a type having a fixed number of named instances) is still there, so the name is fine.
I can fully understand that someone used to C enums would go WTF at seeing Java enums for the first time. What makes it narrow-minded is when this leads you to rejecting them outright and writing blog posts to ridicule them.
> What would you expect a Pythonic enum to look like?
>
> class Colour(Enum):
>     red
>     green
Hell no. I mean, even Go isn't that implicit, you still have to use the iota there. Anyway, this was considered, and you can see the reasons against it (standard explicit is better than implicit) here: http://www.python.org/dev/peps/pep-0435/#id38
> class Colour(Enum):
>     red = 1
>     green = 1
>
> Still, at least the mistake above would raise an error. Wouldn't it? Nope.
> That's a feature. If you mess up on the non-optional values then you get
> "aliases".
Aliases are a feature, like it or not. Having 100% unique enums would not actually be a good thing, seeing as how sometimes libraries support multiple names for the same enum value for backwards compatibility reasons.
And if you do actually want to prevent the duplication problem, you can do this
class Color(Enum):
    red, green, blue = range(3)
    red_alias = red
Or alternatively, if having to specify how many values is too unimaginative for you, something like
class Color(Enum):
    red, green, blue, *_ = range(2**128)
    red_alias = red
will also work.
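The alias behaviour discussed above can be sketched as follows, assuming the enum module as it shipped in the standard library, which also provides a unique decorator for code that wants duplicates rejected. StrictColour is a hypothetical name.

```python
from enum import Enum, unique

class Colour(Enum):
    red = 1
    green = 1      # same value, so green becomes an alias for red

is_alias = Colour.green is Colour.red
members = list(Colour)   # iteration skips aliases

# @unique turns an accidental duplicate into an error at class creation.
try:
    @unique
    class StrictColour(Enum):
        red = 1
        green = 1
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```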
> So, you go hunting around in the docs to see if there's any way at all of
> avoiding the need to assign values manually. And there is:
>
> Colour = Enum('Colour', 'red, green')
>
> which suffers from the same problems as namedtuples:
> - you need to repeat the class name (in a string, which your IDE is
> unlikely to check)
> - the parameters are themselves in a string, which your IDE is
> unlikely to parse and provide in auto-complete (they can be separate
> strings, in a sequence, but that doesn't really help).
>
> Now if two potentially useful library classes are suffering from the same
> problems then isn't that a BIT OF A HINT to try to make things better? Nope. It
> just shows how important it is to not be imaginative. Or something (crack).
It's true! IDEs don't deal well with meta-programming, and Python isn't Lisp! I don't love the namedtuple syntax particularly either, but what solution would you propose? There doesn't seem to be an easy way around it, afaict.
> And it gets worse. What values do you think the above provides?
>
> Strings? That would make sense (red = 'red'), in that it would display
> nicely and is going to provide easy to debug values. So nope.
>>> class Shake(Enum):
...     vanilla = 7
...     chocolate = 4
...     cookies = 9
...     mint = 3
...
>>> for shake in Shake:
...     print(shake)
...
Shake.vanilla
Shake.chocolate
Shake.cookies
Shake.mint
We see that printing an enum value gives the string representation, so I don't really think that it not defaulting to strings is that much of a problem.
> Integers from zero? I mean that's how Python counts indices and there's "only
> one way to do it" so that's how Python counts enums, right? Nope.
I think I almost agree with you here. That said, zero does have a specific meaning in a boolean context, so I can see why they didn't go for it.
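A quick sketch of the boolean-context point, assuming the enum module as released. The Plain and Numbered classes are hypothetical. Plain Enum members are always truthy, while an IntEnum member with value 0 behaves like the int 0.

```python
from enum import Enum, IntEnum

class Plain(Enum):
    zero = 0
    one = 1

class Numbered(IntEnum):
    zero = 0
    one = 1

# Plain members are ordinary objects, so they are truthy whatever the value.
plain_zero_truthy = bool(Plain.zero)

# IntEnum members follow int rules, so a zero-valued member is falsy.
int_zero_truthy = bool(Numbered.zero)
```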
> OK, so bit fields? That way we can do cool Colour.red | Colour.green and
> make the C programmers feel at home? Nope.
Aside from the whole never going to happen argument, I think this kind of shows the problem with doing this implicitly; whatever you do, it's not going to work for some people.
> Give up? I'll tell you. It counts from 1. Presumably because it's really
> common to use the "if not enum" idiom. In someone's crack-addled dreams.
Hey, don't despair so much. You can always read the PEP and realise you can actually provide a dictionary as the second argument. So maybe you'd prefer
def MyEnum(name, vals):
    return Enum(name, {val: i for i, val in enumerate(vals)})
> > What would you expect a Pythonic enum to look like?
> >
> > class Colour(Enum):
> >     red
> >     green
>
> Hell no. I mean, even Go isn't that implicit, you still have to use the iota there.
No, Go just provides plain boring wannabe enums.
Pascal(1972), Modula-2(1978), Ada(1983), and many others:
type Color = (red, green)
ord(red) = 1, ord(green) = 2
len(Color) = 2
succ(red) = green
pred(green) = red
red < green
type colors = array [Color] of int;
Ha! Too much Go for me. It does actually, my bad, I expected the variable to be ignored (and the range creation isn't a big deal usually as it's just a generator).
But even with a generator, as you would get from range in Python 3, the assignment
a, *b = range(10)
makes b into a list. (And I think the star syntax in assignments is only Python 3, as far as I remember Python 2 only has star in argument lists of functions.)
You're right, python2 doesn't have the extended unpacking (star) syntax. Furthermore, this:
>>> a, *b = range(2**128)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: Python int too large to convert to C ssize_t
Yes it does and that is because it actually assigns all of the rest to a variable named '_'. The use of the underscore as a throwaway variable is just a convention and the interpreter is not aware that the result should be ignored.
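A small sketch of the unpacking behaviour described in this sub-thread, in Python 3 only and with small ranges so it actually runs: the starred target always becomes a list, and the underscore is an ordinary variable that keeps the leftovers.

```python
# Extended unpacking collects the remaining items into a list,
# even when the right-hand side is a lazy range object.
a, *b = range(10)
rest_type = type(b).__name__
rest_length = len(b)

# The underscore is just a name; the leftover values are still materialized.
red, green, blue, *_ = range(6)
leftover = _
```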
As much as I have qualms about Java, if there is one thing that it perfectly nailed, it's enums.
The Java enums can be as concise as the C/C++ one but since it's also a class, you can add all kinds of interesting stuff to decorate it such as constructors, getters, type converters, etc... Really a pleasure to work with.
I personally use enums more than I declare them, so I don't mind a little bit of namedtuple-like repetitive syntax. I'm already used to namedtuple, so I don't see anything wrong with reusing that syntax for enum. It's Python, not a brand-new programming language.
I will admit I kind of like Java's syntax for enums because it makes it very clear what exactly an enum is: a class with a bounded set of instances:
I never got python's lack of built in enum syntax. I get not wanting to add unnecessary keywords, but unless you're doing something like Scheme or Smalltalk where you're going for almost no syntax at all, you would think it would pay off to put in keywords for things that literally occur in every program. I feel like I should just be able to write this:
enum Color: red, green, blue
Right now enumerations are a mess. It seems like every project does them slightly differently, and the new enum library just adds a new way to do them slightly differently. That's like the opposite of PEP 8.
Well, any new enum syntax would be "a new way to do them slightly differently". This one just has the advantage of being in the standard library, and thus is presumably what new usage will gravitate towards.
Weird that he didn't address it, but optimizing for IDEs is not python's style. Although I'm sure the pains they went through with the 2to3 tool might make some of them second guess it.
You'd get heavily downvoted for asking "Did Peter Norvig not read the fucking article?", yet that's exactly what you've put into everyone's mind with your comment.
A stdlib implementation is a whole lot easier and less disruptive than new syntax in core. This counts for a lot. But of course you're vastly more limited in what you can do. The main benefit of this proposal that I can see is not the specific functionality it provides, but simply the fact that it makes it clear and explicit that this is a set of constants that have some meaning to the module - putting such constants in ordinary classes/namespaces (along with everything else) can obscure the fact they are not variables/attributes.