In honor of Weizenbaum's Centenary, I asked ChatGPT to program ELIZA
88 points by abrax3141 on Jan 9, 2023 | 53 comments
When I was 13 (many decades ago), I wrote the BASIC ELIZA that many of you have probably experienced (if you are of a certain age) or (if you are not ...) you've almost certainly experienced a knock-off of mine.

Today, Jan 8th, is the centenary of Joseph Weizenbaum's birth, so I asked ChatGPT to write ELIZA in js.

Long dialog (~10 rounds) short, here's where it ended up:

''' // A list of ELIZA's responses const responses = { "I feel": [ "Why do you feel {0}?", "What do you think is causing you to feel {0}?", "Do you often feel {0}?", "How do you normally deal with feeling {0}?" ], "I want": [ "Why do you want {0}?", "What led you to want this?", "Do you often find yourself wanting {0}?", "How do you think getting what you want will make you feel?" ], "I need": [ "Why do you feel like you need {0}?", "What led you to feel like you need this?", "Do you often find yourself feeling like you need {0}?", "How do you think getting what you need will make you feel?" ] };

// A list of keywords const keywords = Object.keys(responses);

// A function that generates a response function generateResponse(input) { // Check if the input contains any of the keywords for (let i = 0; i < keywords.length; i++) { if (input.includes(keywords[i])) { // If the input contains a keyword, return a random response // that includes the remainder of the input, with personal // pronouns conjugated correctly const response = responses[keywords[i]][Math.floor(Math.random() * responses[keywords[i]].length)]; return response.replace("{0}", input.substring(keywords[i].length + 1)); } } }

// Test the function console.log(generateResponse("I feel sad")); console.log(generateResponse("I want a new car")); console.log(generateResponse("I need some help"));

'''

My conclusion: It's not up to the level of a 1970s 13-year-old programmer... yet. :-)

---- (Apparently ''' ... ''' doesn't do code in HN. Sorry for the wrap-around formatting.)



BTW, in the long dialog that I had with it, ChatGPT created many different versions of this code with varying degrees of matching my spec. Every version sort of matched my spec, but at the same time each and every one of them was broken in various random ways. None actually worked out of the box. As with all of these grammatical confabulation engines, they fall headlong into the uncanny valley, looking sort of like a program, but when there's a right and wrong, they tend to end up on the wrong side.


You're right of course, but given how much better (more useful) ChatGPT is compared to its predecessor GPT-3, it is incredibly impressive.

Starting from the current state, you could make ChatGPT a much better programmer simply via a brute-force approach:

  1) ask it to generate (positive and negative) test cases first
  2) ask it to review the test cases it wrote to make sure they fit the spec
  3) ask it to produce an architecture for the program you want (you need to specifically ask it to write modular code consisting of small, testable functions), fitting the spec and test cases
  4) ask it to write the tests for each function
  5) ask it to find mistakes in the functions it wrote
  6) run the functions against the tests it wrote and present the failures to it, asking it to fix them (loop here until fixed)
  7) run the e2e tests it wrote at the beginning and present mistakes to it, asking it to fix them
  8) run the whole process in parallel multiple times, until one of them works
It's not pretty, it's not cheap and it's not super robust, but neither is the code written by the majority of programmers. And I'm sure you could make it understand the code even better using some kind of evolutionary algorithm, by letting it play with an interpreter.
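Here's a minimal sketch of that loop in JavaScript, assuming a hypothetical llm(prompt) call that returns the model's output as a string and a runTests(code, tests) harness you'd have to supply yourself (neither is a real API):

  // Steps 1-3: generate tests, review them, then ask for modular code.
  // Steps 5-8: loop, feeding failures back until something passes.
  async function bruteForceProgram(spec, llm, runTests, maxRounds = 10) {
    const tests = await llm(`Write positive and negative test cases for: ${spec}`);
    const review = await llm(`Review these tests against the spec "${spec}":\n${tests}`);
    let code = await llm(
      `Write modular code (small, testable functions) satisfying this spec ` +
      `and these reviewed tests:\n${spec}\n${review}`);
    for (let round = 0; round < maxRounds; round++) {
      const failures = await runTests(code, tests);
      if (failures.length === 0) return code;            // it works, ship it
      code = await llm(
        `These tests failed:\n${failures.join("\n")}\nFix this code:\n${code}`);
    }
    return null;  // step 8: run several of these in parallel and keep a winner
  }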


Your steps 1-8 are a great example of prompt craft.

ChatGPT is like power-assisted prompt-craft. Special-case tricks and tunings make casual users rate ChatGPT-assisted prompt results higher than GPT-3 alone. Some of this is discussed here:

ChatGPT -- ChatGPT is a sibling model to InstructGPT, which is trained to follow an instruction in a prompt and provide a detailed response.

See: https://openai.com/blog/chatgpt/

InstructGPT -- We’ve trained language models that are much better at following user intentions than GPT-3 while also making them more truthful and less toxic, using techniques developed through our alignment research. These InstructGPT models, which are trained with humans in the loop, are now deployed as the default language models on our API.

See: https://openai.com/blog/instruction-following/

But if you carefully craft prompts and prime continuations, it seems GPT-3.5 (the new davinci models) lets you color outside ChatGPT's lines to achieve arguably even better results once you have your own crafted prompt.


> Starting from the current state, you could make ChatGPT much better programmer simply via brute-force approach

In fact, you could probably use a compiler and a human-supplied test suite to generate training data for the network. Let it do the looping with the compiler: its output is fed into the compiler, and if any error happens you feed the compiler output back to the network. Then, if it manages to write something that compiles, you run the tests and feed any failures back into the network.

If it ever manages to find a solution, you add it to the dataset used to train the next version.
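Roughly, the harness would look like this (llm, compile, and runTests here are hypothetical stand-ins, not real APIs):

  // compile(code) -> { ok, errors, binary }; runTests(binary) -> failure list.
  async function mineTrainingExample(task, llm, compile, runTests) {
    let code = await llm(task);
    for (let attempt = 0; attempt < 20; attempt++) {
      const build = await compile(code);
      const failures = build.ok ? await runTests(build.binary) : [build.errors];
      if (build.ok && failures.length === 0) {
        return { prompt: task, completion: code };  // keep for the next training run
      }
      code = await llm(`${task}\nErrors:\n${failures.join("\n")}\nFix this code:\n${code}`);
    }
    return null;  // never solved it; nothing to add to the dataset
  }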


Actually, I did do a version of this. Note that in the posted code there's a weird comment that says: "...that includes the remainder of the input, with personal pronouns conjugated correctly", but the code does not actually do this at all. It seems to have misinterpreted this as referring to pronouns in its response templates, as opposed to the pronouns in the echoed part of the user's input. I even tried to give it examples, and it was always polite, but it never got what I meant. Because you can't really have the sort of close-quarters interaction you can have with a student or co-programmer, you spend a lot of time trying to query-hack. I could literally have written what I wanted faster than it took to fail to get it to understand what I wanted. (Which I never did.)
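For what it's worth, the pronoun reflection classic ELIZA actually does is only a few lines of JavaScript; here's a minimal sketch (mine, not ChatGPT's output, and with a deliberately tiny dictionary):

  const reflections = {
    "i": "you", "me": "you", "my": "your", "am": "are",
    "you": "i", "your": "my", "yours": "mine", "mine": "yours"
  };

  function reflect(fragment) {
    return fragment
      .toLowerCase()                       // note: loses capitalization
      .split(/\b/)                         // word-boundary split keeps spacing
      .map(tok => reflections[tok] ?? tok)
      .join("");
  }

  console.log(reflect("my friends ignore me"));  // "your friends ignore you"

Wiring it into the posted code would just mean returning response.replace("{0}", reflect(...)) instead of the raw remainder.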


BTW, I don’t understand what the value is of making it create tests. It doesn’t actually run them. In fact, although I didn’t ask for them, it did create tests in some cases, but the code it wrote didn’t actually pass the tests!


BTW, in case you missed it:

    https://corecursive.com/eliza-with-jeff-shrager/
(Does it count as shameless self-promotion if I'm actually promoting someone else's highly relevant podcast that just happened to have me in it?)



I think CoRecursive gets HN approval regardless, as long as it's relevant haha


ELIZA is mentioned in passing in Cormac McCarthy's latest novel, Stella Maris (which is a companion/coda to his novel that came out one month prior: The Passenger). I didn't know what ELIZA was when it was mentioned, and looking it up kicked off a long rabbit hole of learning about early AI researchers. I thought it was funny that people in the '60s attributed human-like qualities to ELIZA, which we now view as a ridiculously primitive program. Perhaps the humans of 2073 will look back on ChatGPT/etc very similarly to the way we now look at ELIZA. Relevant quote from Wikipedia (https://en.wikipedia.org/wiki/ELIZA):

"Some of ELIZA's responses were so convincing that Weizenbaum and several others have anecdotes of users becoming emotionally attached to the program, occasionally forgetting that they were conversing with a computer.[3] Weizenbaum's own secretary reportedly asked Weizenbaum to leave the room so that she and ELIZA could have a real conversation. Weizenbaum was surprised by this, later writing: "I had not realized ... that extremely short exposures to a relatively simple computer program could induce powerful delusional thinking in quite normal people."

Wikipedia states that a version of ELIZA is still available in Emacs via M-x doctor or something similar. Not being an Emacs user, I haven't verified that, but I thought it was interesting.


You have to try to put yourself in a world where most computers were room-sized, and there wasn't even anything smaller than what used to be called a "mini-computer" -- more-or-less a very large tower -- and no individual had one of their own. This was also 3 years before 2001's HAL 9000, so the idea of even talking to a computer was nowhere near the public consciousness. In that context, it's not too surprising that people had trouble understanding what they were talking to. OTOH, it's not actually entirely clear that JW's interpretation of what was going on in these folks' minds is correct. I mean, we rubber duck all the time...sometimes to a literal rubber duck (although usually in that case just to be funny)...so telling this inanimate thing your deepest secrets might be just a way to get them off your chest. We don't know, but my point is that it's hard to imagine what someone in the 1960s might have thought when asked to talk to a computer that was asking them seemingly intimate questions.


I think I might have misphrased what I meant. It is a truism that "people in the past found new technological advances amazing that we now view as trifles". It's more the fact that we now have AI programs which are crazy advanced compared to ELIZA, and yet we probably have very similar reactions now to ChatGPT, DALL-E 2, etc., as those people in the 1960s did to ELIZA. And yet, in 60 years, barring civilizational collapse, people will look at ChatGPT and think "How quaint! How hilarious to think of that ludicrously primitive program as an AI!".

I guess that too is also a truism - "People in the future will look at our current technology and laugh", but somehow this one makes me feel things. The idea that what we have now is like a blunt 2x4 compared to the surgeon's scalpel that we might have in 60 years. (Or might not - who knows!)


I'd add to that: this is essentially a kind of auto-Socratic explorer for a topic. It constantly questions the assumptions of the user, albeit in a rote and very simple prescripted way, and could still lead to insights or new perspectives for the user. This is also a major potential benefit of technologies like ChatGPT, IMHO. If we expect it to know things we don't, we might be disappointed, but if we treat it as a well-meaning teacher with skills in pedagogy who depends on our own knowledge, I think we can reap great benefits.


I think the strongest argument that AGI has been achieved is when ChatGPT vN can write ChatGPT vN+1 and have it be a Pareto improvement. Can we call it the User Test?

  Now for some real User power.


In that case, I don't think we're provably intelligent just yet!


At least some human beings manage to produce new human beings that are at least approximately Pareto improvements by using a well-chosen training set. There is a long-term trend toward the mean, though, so we're not likely to see a biologically based singularity.


I think we already had a biological singularity! Many of them in fact! I think it's just likely that we were the final biological singularity.

I think a lot of the time we think of the singularity as "the point at which the rules break because the evolution of some parameter is infinite", but I think really what this term means is the point at which the current rules are incapable of predicting the evolution of the system, resulting in an apparent infinity. That infinity probably is not physically meaningful, but it might as well be from the point of view of an actor before the new rules are known.

When we evolved from ape-ish things capable of only the most primitive communication into things capable of transmitting complex information about our world across generations we hit a rate of technological growth that might as well have been a vertical line compared to the previous history of life on earth.

When single-celled life managed to capture mitochondria and unlock incredible chemical energy reserves, that was another similarly vertical line in the complexity of life.

I'd say our history is a series of singularities that only look like gradual progress from the point of view of our own pseudo-infinity. We are a step in the process of the universe understanding itself but we aren't the last step, it would be silly to think so, given what we know about past leaps in the entropy creation rate of life.


One thing that bothers me when it comes to AI is how human-centric the approach to AI is. We, humans, are the measuring stick.

Fundamentally, this approach is flawed. Can one really talk about intelligence without talking about the environment in which that intelligence manifests itself?


Sure we can talk about intelligence; we have done so for at least a century. But you're right, the definition we use is anthropomorphic and almost certainly missing important things — as demonstrated by all the times computers did a task widely regarded as emblematic of high intellect at super-human levels, only for humanity to collectively shrug and say "not intelligence".


This cycle of

1. AI will never be able to do X because that requires real intelligence/creativity

2. AI does X to a (super)human level

3. X does not require real intelligence/creativity, it is just search/statistics/etc.

with increasingly difficult values of X cannot possibly continue forever, right?

And yet, it seems to happen so reliably.


The problem is that at step 2, the super-human performance frequently turns out to have massive holes in it. Playing games with fixed rules is one exception, but most other things that "AI cannot do but now does" always come with a shift in our expectations. I would say that the very best "AI doing amazing shit" tends to be things never discussed in Ye Olde Literature. Google Maps is a premier example.


Yes, it's superhuman for a very narrow set of well-defined tasks. We observe it and claim that AI is taking over.


The terms are not very consistently applied, but I would call that ASI rather than AGI.

Personally I think the General in AGI is real-valued (or possibly vector-valued) and definitely not a mere Boolean yes-or-no, and therefore it's fine to say ChatGPT is an AGI because it has the generality of "text on most subjects in most languages".

Well… sort of. I'm not sure about the Intelligence part of the label though, because it's making up for being very inefficient with the training set by running on hardware that outpaces synapses to the degree to which marathon runners outpace continental drift. But this leads to the question "Does a submarine swim?" (I.e. does it matter?)


My experience with anything I ask it is garbage. OK, in that case it is an AGI of value 0.0000000001


Really? All garbage? Both of these conversations contain zero cherry-picking. Code works, solves what I wanted it to solve. I don't speak most of the languages in the video, but bits I could follow looked good to me.

https://github.com/BenWheatley/Studies-of-AI/blob/main/codin...

https://youtu.be/XX2rbcrXblk

Also used it to update the style on my website. Its solution has one bug that shows up on Chrome but not Safari, and was otherwise better than I could've done Googling because I didn't know the search term.


We used to write them in BASIC back at the U of Iowa in the '80s. Lots of pattern matching for subject and object; lots of dialog trees. It was fun!

One observation Bob Bryla made: count the number of words that begin with 'N' and treat each as a logical negation. With a very small dictionary of additional negation words (e.g. can't) and a whitelist (e.g. nice), you could tell the sense of a statement or question pretty accurately.

Nope, never, not, nada, na, nobody and on and on.
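Something like this, presumably (my reconstruction, not Bryla's original BASIC; the dictionary and whitelist are invented for illustration):

  const extraNegations = new Set(["can't", "won't", "don't", "isn't"]);
  const whitelist = new Set(["nice", "new", "now", "night"]);

  function seemsNegative(sentence) {
    const words = sentence.toLowerCase().match(/[a-z']+/g) ?? [];
    return words.some(w =>
      !whitelist.has(w) && (w.startsWith("n") || extraNegations.has(w)));
  }

  console.log(seemsNegative("nope, never gonna happen"));  // true
  console.log(seemsNegative("that would be nice"));        // false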


Okay, so holy f'ing s#!t!

Out of curiosity, I today asked it to program ELIZA in MAD-SLIP, the (now completely dead and obscure) language that ELIZA was originally written in. My expectation was that it would have no idea what MAD-SLIP is, or maybe would barely have an idea, and in any case would have no chance of doing what I asked, and that it would, very reasonably, balk.

Instead, to my shock, it made up a completely bogus meaning for the acronym: "Matched Activated Decomposition by Syntactic and Lexical Interchange Program", and actually wrote an ELIZA in this nonexistent language, which was in reality just Python code using the regexp (re) package, and was perfectly fine as far as it went.

Even scarier than confabulating a nonexistent programming language, it confabulated a citation for it, claiming that it is described in Weizenbaum's Computer Power and Human Reason, which is utterly false!

Sigh. I fear that ChatGPT is merely bringing Weizenbaum's worst fears into reality! A seemingly authoritative AI that simply makes s#!t up! I'm feeling a bit like taking up arms at the moment.


That's so cool! One of the ways I learnt programming in BASIC was by studying the program outputs next to the listings in those Creative Computing books. ELIZA was one of my favourites, so thanks for doing the port all that time ago.


I think I first experienced ELIZA on a PDP-11

What's really going to get spooky is when, someday, some ChatGPT descendant writes a better ChatGPT. Where does that sequence end?


Well... even if some version of ChatGPT is able to write a better ChatGPT2, there's no particular reason ChatGPT2 should be able to write a better ChatGPT3.

And if it can, there's no guarantee ChatGPT3 would be able to write a better ChatGPT4, and so on.

Exponential phenomena tend to need the right environment to sustain, and, by their nature, quickly outgrow, disturb/ruin, or consume their environment.

Not that there couldn't be an AI explosion (as far as I know), but there's really no particular reason to think one is imminent -- especially since we haven't seen step 1 yet.


Exponential growth with finite resources invariably yields a sigmoidal growth curve. I don't know how singularity-crazed futurists still don't get that.
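For reference, that's just the textbook logistic model: growth looks exponential while N is far below the carrying capacity K, then flattens into the familiar S-curve:

  \frac{dN}{dt} = rN\left(1 - \frac{N}{K}\right)
  \quad\Longrightarrow\quad
  N(t) = \frac{K}{1 + \frac{K - N_0}{N_0} e^{-rt}}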


Are there (effectively) finite resources when it comes to computation, though? We are so unbelievably far from the limits arising from theoretical physics that we could probably have millennia of Moore's Law without having the computer collapse into a black hole (or whatever other known limits apply).


Someone linked a very long-winded article about how compute hardware as powerful as the human brain will cost 1 cent in 25 years...

Come on...

Also, from a planar perspective we've hit the end of the road for silicon transistors. All new designs must be three-dimensional to increase density, and a millennium of Moore's Law would imply that somehow your computational resources grow faster than the speed of light. I mean, you physically cannot travel fast enough to acquire interstellar resources for use in computation when you are doubling every two years. You are talking about millions of galaxies filled with data centers by the end of the millennium, maybe the entire universe; I haven't checked the numbers.

All I know is that even 3% growth over 2000 years gives illogical results, like colonizing a dozen Milky Ways.
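A quick sanity check of that compounding:

  // 3% compound growth sustained for 2000 years multiplies the starting
  // quantity by roughly 10^25.7.
  console.log(Math.pow(1.03, 2000).toExponential(2));  // "4.73e+25"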


Why do you think your computation resources have to grow faster than the speed of light? That isn't even a dimensionally consistent statement. Are you assuming that the FLOPs per unit volume per second will be constant? Or that the amount of energy per FLOP will be constant?

I don't think either of these assumptions is reasonable.


I think it's more accurate to say that ChatGPT was trained than written. The advancements in this space have primarily come from pushing the parameter count of the model high enough that you hit a threshold where you unlock an order of magnitude more capability than before. It's mostly a matter of how much compute and training data you can throw at it.


The advancements since GPT-3 version 1 have mostly been better training. (Well, RLHF is more data as well, but not in the same sense.)

There's also this DeepMind paper (https://www.deepmind.com/publications/an-empirical-analysis-...) saying people should focus on training on more data rather than on scaling up parameters. Of course, there's not much more freely available text anyway.


>where does that sequence end?

Technological singularity, hopefully!


Hello, I'm Ray Kurzweil and welcome to my TED talk!


I think this is the best way to honor Weizenbaum's Centenary.


Does it please you to believe I am of a certain age) or( if I am not...) I've almost certainly experienced a knock-off of yours?


  // A list of ELIZA's responses 
  const responses = {"I feel": 
                        [ "Why do you feel {0}?", 
                          "What do you think is causing you to feel {0}?", 
                          "Do you often feel {0}?", 
                          "How do you normally deal with feeling {0}?" ],
                     "I want": 
                        [ "Why do you want {0}?", 
                          "What led you to want this?", 
                          "Do you often find yourself wanting {0}?", 
                          "How do you think getting what you want will make you feel?" ], 
                     "I need": 
                        [ "Why do you feel like you need {0}?", 
                          "What led you to feel like you need this?", 
                          "Do you often find yourself feeling like you need {0}?", 
                          "How do you think getting what you need will make you feel?" ] };

  // A list of keywords 
  const keywords = Object.keys(responses);

  // A function that generates a response 
  function generateResponse(input) { 
  
      // Check if the input contains any of the keywords 
      for (let i = 0; i < keywords.length; i++) { 
          if (input.includes(keywords[i])) { 
              // If the input contains a keyword, return a random response 
              // that includes the remainder of the input, with personal 
              // pronouns conjugated correctly 
              const response = responses[keywords[i]][Math.floor(Math.random() * responses[keywords[i]].length)]; 

              return response.replace("{0}", input.substring(keywords[i].length + 1)); 
          } 
      } 
  }
  
  // Test the function 
  console.log(generateResponse("I feel sad")); 
  console.log(generateResponse("I want a new car")); 
  console.log(generateResponse("I need some help"));


Not sure if Weizenbaum would be honoured.

> Weizenbaum, for his part, turned away from his own project’s expanding implications. He objected to the idea that something as subtle, intimate, and human as therapy could be reduced to code. He began to argue that fields requiring human compassion and understanding shouldn’t be automated. And he also worried about the same future that Alan Turing had described — one where chatbots regularly fooled people into thinking they were human. – https://99percentinvisible.org/episode/the-eliza-effect/


I’ve thought about this - quite a bit, in fact, and have come to the belief that one can separately respect and honor a person’s contributions. Both JW’s antipathy toward AI, and ELIZA itself, are significant contributions. Just because he rejected ELIZA (which, BTW, I do not think is the way to think of his relationship to it, but that’s a longer story for another time) does not make ELIZA’s influence and historical value vanish.


Not denigrating Weizenbaum's work at all, but he was a bit of a squeaky wheel on the "I fear for the future" technology/society/"oh the humanity" stuff. I'm pretty sure ChatGPT could do even better justice to that aspect of his career.


In Emacs: M-x doctor <RET>


>> ---- (Apparently ''' ... ''' doesn't do code in HN. Sorry for the wrap-around formatting.)

For pre-formatted code leave two spaces at the start of the line.

  Like this.
There's no special notation for code. Despite this being, you know, Hacker News.


> For pre-formatted code leave two spaces at the start of the line

[...]

> There's no special notation for code

You just explained the special notation for code, which is so important that it is HN’s only block environment, and then turn around and deny it exists?


I assume they want block notation and syntax coloring, rather than per-line notation and just monospace.


pg added that formatting option specifically for code, way back when he first wrote the software.


  Well then, I stand corrected :P


Maybe I'm missing something obvious, but why can't we just write in Markdown?


Indented code blocks are Markdown. The original Markdown syntax used 4-space indentation for code blocks. GitHub later added support for `~~~` fenced code blocks, and those became more popular.
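For example, both of these render as a code block:

      console.log("hi");   // original Markdown: four-space indent

  ~~~
  console.log("hi");       // GitHub-flavored: fenced
  ~~~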


Can you program ChatGPT with ChatGPT?




