Weird hiccup in ChatGPT citations of statutes and court opinions

I asked ChatGPT to convert Sec. 104 of the tax code into a Shakespearean play. The result was instant and nothing short of brilliant. It takes much more effort with prompting to get a summary of a statute. ChatGPT 3.5 would not accept the premise of my prompt, which, out of frustration, eventually evolved into a leading question. It wasn't until I said, "Look, yes or no, does P.R. Civil Code Section 1.231(d) specifically list “manure” as personal property that is “unmoveable” or not?!"

Once the prompt became that specific, it conceded the premise and seemed to have learned the distinction between U.S. law and P.R. law, which is based on Spanish law.

What is throwing me is that it seems to make up citations for court opinions. Nothing about the response would alert me that the answer is defective, so there is no cue to modify the prompt. However, when you double-check the citation, there is no such case in existence. Legal citations are very specific, as you can imagine.

When I ran it in 4.0, it would NOT answer the prompt at all. ChatGPT 4.0 got a 90% on the bar exam and ChatGPT 3.5 failed, I hear. So maybe 4.0 is such a good lawyer that it avoids direct prompts if it does not know the answer?

I have only found one other lawyer here. He is from Israel. Any other observations like this? I mean, if ChatGPT can paint a masterpiece and write a play, but can’t do basic legal research… that doesn’t sound right to me.

Just making a record, but I asked for clarification and a source to verify. It listed Cornell, FindLaw, etc., and told me to consult with an attorney. I will note that I earlier asked about a specific IRS opinion letter, referenced by number, and it could not get the fact pattern correct. I noticed this before while writing an article for my bar association, when I referenced an article I had read on Westlaw about a judge denying attorney’s fees because AI could have done the work for practically nothing. It found the wrong case in a completely different country, and I could not prompt it to the correct fact pattern. The issue is not the prompt, because it tells you when it can’t respond. Rather, it is answering the prompt, but with false information.

GPT itself is a Language Model. In particular, a Large Language Model. AI as we have it today isn’t ‘intelligent’ in the way living creatures are; it just simulates the appearance of intelligence through an understanding of language. These models understand how words relate to one another, and all the legitimate patterns of words in usage.

So, while it understands the word ‘citation’, and can tell you about an incredible array of ways that word has been used across millions of documents, and how it correlates with other words to mathematically model ‘relatedness’… it doesn’t really understand what a citation is. But it can generate something that looks like one, because it understands the pattern, just not the underlying concept.
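
To make ‘relatedness’ a little more concrete: under the hood, words (tokens, really) are turned into long lists of numbers, and closeness between those lists stands in for closeness in meaning. Here’s a toy sketch in Python, with made-up three-dimensional vectors instead of the thousands of dimensions a real model learns:

```python
import math

# Toy word vectors (invented numbers; real models learn thousands of dimensions).
vectors = {
    "citation": [0.9, 0.1, 0.3],
    "footnote": [0.8, 0.2, 0.4],
    "banana":   [0.1, 0.9, 0.7],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: closer to 1.0 = more 'related'."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# "citation" sits much closer to "footnote" than to "banana".
print(cosine_similarity(vectors["citation"], vectors["footnote"]))  # ~0.98
print(cosine_similarity(vectors["citation"], vectors["banana"]))    # ~0.36
```

That kind of numeric ‘relatedness’ is all the model has; there is no separate store of facts about which citations actually exist.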


I think I get that. Mostly. My prompt was more akin to “list five court opinions in which the court pierced the corporate veil and the defendant was a Wyoming LLC”. My question orbits around a list of court opinions with citations that are false. That is, the format of the citation is correct, but there is no court opinion at the citation, and the case name is not the name of any actual case.

Put differently, if the prompt was not artful or did not properly train it, how could it possibly even try to list anything in answer to the prompt? It is good at punting when it does not understand. But to answer the question, display a list of manufactured… Oh… I get it. ChatGPT is predicting the coming words or string of figures. There is something about statutes, court cases, and IRS letter rulings where it just fails miserably in its prediction. Thank goodness the prompt isn’t the process for landing a plane.
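
If I have that right, at every step the machine is just scoring every possible next word and picking a likely one. My lay understanding of that loop, as a toy sketch (the probabilities are invented, and the real system is vastly more complex):

```python
import random

# Invented probabilities for the token that follows "410 U.S." in a citation.
next_token_probs = {
    "113": 0.20,    # looks like a page number, so it scores well...
    "241": 0.15,    # ...and so do other plausible page numbers,
    "962": 0.10,    # whether or not a real case sits at that citation.
    "banana": 0.001,
}

def pick_next_token(probs):
    """Sample one token, weighted by its probability."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return random.choices(tokens, weights=weights, k=1)[0]

print("410 U.S.", pick_next_token(next_token_probs))
```

Every plausible page number scores well, whether or not a case actually sits there. That would explain the false citations.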

One thing I wonder is liability. If AI can’t own a copyright, it certainly can’t be sued. Its parent company can. But in this case, the error isn’t based on a set of code. It’s the… model itself. Does that sound right?


Yup, you have the false citations idea right. It knows what the pattern of a citation is, and in some cases it can even use right-sounding names for those who authored it, especially if there’s a pattern of those names appearing in lots of citations on [insert topic here]. But it will be making up the title (based on what other citation titles look like), and it doesn’t understand that the citation should actually exist.
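
In code terms: the model can satisfy the *shape* of a citation without any lookup that the citation *exists*. A rough sketch of the difference, with a tiny made-up dictionary standing in for a real case-law database:

```python
import re

# The shape of a U.S. reporter citation, e.g. "410 U.S. 113".
CITATION_PATTERN = re.compile(r"^\d+\s+U\.S\.\s+\d+$")

# Stand-in for a real case-law database (a tiny, hand-picked subset).
KNOWN_CITATIONS = {
    "410 U.S. 113": "Roe v. Wade",
    "347 U.S. 483": "Brown v. Board of Education",
}

def looks_like_citation(text):
    """Roughly what the model learns: does this match the pattern?"""
    return bool(CITATION_PATTERN.match(text))

def citation_exists(text):
    """What the model never does: check whether a case sits at this citation."""
    return text in KNOWN_CITATIONS

fabricated = "412 U.S. 999"
print(looks_like_citation(fabricated))  # True  -- the pattern is satisfied
print(citation_exists(fabricated))      # False -- nothing in the database
```

ChatGPT only ever does the first check, never the second.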

I think that is partly a result of how it was built to work. They absolutely wanted an AI that could generate responses without plagiarising. So it never bases its replies on one source, or one document, but instead deliberately creates something from diverse sources distilled into a generally true pattern.

Great for everyone except when you want or need it to use one source and name that source. It was carefully and consciously designed NOT to work that way.

Laws… well, I’m sure you know as well as I do that the law mostly looks backward. Most laws are based on preventing something that already happened from happening again. Precedent and prior cases… It’ll be literally decades before the laws catch up with where we are now, and by then, very little will still apply anyway.

What I can say is that AI is a long, long, long way from being ‘sentient’, and thus responsible for its actions. It is therefore a machine, however complex. The operator is the one responsible. With, of course, caveats for where the operator was told by the manufacturer that it was safe to do something that wasn’t true, in which case a claim of negligence could be made against the manufacturer.

OpenAI with ChatGPT (and Google with Bard too) both very carefully explain that the AI is prone to errors and hallucinations, and that the accuracy or fitness for purpose must be checked by the operator. The prompt author bears all of the responsibility in most cases, even in regard to flaws in the AI, because they have been warned that such flaws exist and that they need to check for them.

That help?

That helps. Thank you. I will continue to read and experiment. I am curious: do you think I could have trained ChatGPT to select court opinions by matching fact patterns? It seems analogous to training it to identify the subject of a photo for captcha-type purposes. It views images as numbers, I guess, and looks for patterns that indicate a boundary between the subject and the background. Then, as I understand it, it compares that boundary against its current knowledge of what a cat looks like. It seems like it would be easier for something as mundane as a legal case, where matching key words, the Google method, works. You can’t match the key word “cat” with a cat, because cats don’t have key words. But legal cases do. It can correctly describe a Spousal Lifetime Access Trust and “talk about” a certain case, but it doesn’t “think” to find the court opinions whose data elements match the prompt parameters? Is that it? I thought I had it, but then I lose focus on the kernel of it.
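
Here is the kind of key-word matching I am picturing, as a toy sketch (the cases and summaries below are invented, purely to show the idea):

```python
# Toy "Google method": rank cases by key-word overlap with the prompt.
cases = {
    "Smith v. Acme Holdings LLC": "pierced corporate veil wyoming llc undercapitalization",
    "Jones v. Jones":             "divorce custody support modification",
    "In re Frontier Ranch LLC":   "pierced corporate veil wyoming llc alter ego",
}

def score(prompt_keywords, case_summary):
    """Count how many of the prompt's key words appear in the case summary."""
    summary_words = set(case_summary.split())
    return len(set(prompt_keywords) & summary_words)

prompt = ["pierced", "veil", "wyoming", "llc"]
ranked = sorted(cases, key=lambda name: score(prompt, cases[name]), reverse=True)
for name in ranked:
    print(score(prompt, cases[name]), name)
```

Something like that is retrieval, which seems to be a different operation from the word prediction ChatGPT does.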

I AM reading and watching YouTube videos. The background on AI and the neural network research from back in the day helped a lot.

I think that is very unlikely with ChatGPT, but it might be possible if you were licensing and using GPT4 itself, without all the limitations and caps on training they added in the ChatGPT version.

If you’re not aware, good old GPT3 could be trained further and customized to a near-infinite degree from the starting point. Not via ChatGPT, but as a raw LLM that you train yourself, add your own ‘higher systems’ to, and use in your own applications.
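
For a taste of what that looks like in practice, here’s a rough sketch of OpenAI’s fine-tuning flow for the GPT3-era models, using the openai Python library. The file name and the example training pair are made up, and the API details have a habit of changing, so treat this as an illustration rather than a recipe:

```python
import openai

openai.api_key = "sk-..."  # your API key

# Training data lives in a JSONL file (one JSON object per line), e.g. cases.jsonl:
# {"prompt": "Facts: Wyoming LLC, undercapitalized...", "completion": " Veil pierced."}

# 1. Upload the training file.
upload = openai.File.create(file=open("cases.jsonl", "rb"), purpose="fine-tune")

# 2. Start a fine-tune job against a base GPT3 model.
job = openai.FineTune.create(training_file=upload.id, model="davinci")

# The job runs on OpenAI's side; when it finishes you get the name of
# your own customized model, which you can then call like any other.
print(job.id)
```

The point being: you bring your own examples, and the base model gets nudged toward your domain.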

REALLY? Good to know. Reading your answer, I hung on your use of LLM. It made no sense at first. Among lawyers, an LLM refers to a master’s in tax. Lol. You meant large language model. This is an example of humans not fully understanding the prompt, let alone the machine.

I am interested in leveraging ChatGPT or any other technology to help me launch a software application that drafts legal documents. NOTE: I do not want to use ChatGPT as the document production engine. But it can help with marketing, sales, and coding, e.g., SQL, Python, etc. In every spare moment, I try to understand, but I am distracted by my practice, etc. As such, I am unaware of the history of ChatGPT other than the few YouTube videos I have watched. I think you are saying that a prior version of GPT has some sort of licensing option that could allow me to customize it without the typical restrictions? I do not know how ChatGPT can be an “engine” for an app. I AM experimenting with BotPress to see how ChatGPT might embed into a bot. I have a GitHub account and am trying to find resources and relationships there.

ChatGPT is one specific application of GPT (Generative Pre-trained Transformer). GPT is the name for OpenAI’s Large Language Model, in its various iterations (we are currently up to GPT4). Many other companies have their own Large Language Models (and some other styles of AI models) but what sets OpenAI’s GPT apart is that they license theirs to pretty much anyone. There are a huge range of third-party AI tools out there based on GPT2 for example, including lots and lots of the third-party “AI detectors” people rushed out to seize the opportunity.
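
To give a sense of what ‘engine’ means mechanically: an application just sends the model some text over the API and gets text back, and everything else (the interface, the document assembly, the database) is your own software wrapped around that exchange. A minimal sketch with the openai Python library; the model name, prompts, and clause are placeholders, and the API surface may well have changed by the time you read this:

```python
import openai

openai.api_key = "sk-..."  # your API key

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",  # or whichever model you have access to
    messages=[
        {"role": "system", "content": "You summarize legal clauses in plain English."},
        {"role": "user", "content": "Summarize: The Member shall not be personally liable..."},
    ],
)

# The generated text, which your application is then free to use however it likes.
print(response.choices[0].message.content)
```

That request/response loop is the whole ‘engine’; bot builders like BotPress ultimately drive that same loop for you.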

The Language Models can be built around specific types of use, and often are. For some idea of that, you might like to do some research on MedPaLM, a medical specialist version of the PaLM language model (from Google Research; Alphabet/Google, along with their DeepMind unit, are generally understood to be the world leaders in AI right now).

Actually, just doing a little reading up on the recent works and projects of DeepMind, even if only via their own “In The News” pages, would give you great insight into how different AIs are either built for, or tuned for, specific purposes.

When Google were first rumoured to be planning to add their own AI alongside their search engine, I think @aiprm-christophc and I were both fully expecting Google to use a variant of their PaLM model - all of its uses seem to be strongly geared around accuracy and giving correct answers reliably. Instead, Google went with Bard, which is a variant of the LaMDA language model, which is more about dialogue and back-and-forth chatty applications.

It shouldn’t take you too long to get at least a really decent grip on the basic concepts of how language models differ, and how different ones are trained or tuned for specific purposes. I think doing so would be highly beneficial to your plans, and I suspect you’d find it interesting to learn anyway.

I recognize all of those words as English and understood most of them. Progress! Not to keep you from your work, but I went to a case law service to make sure none of the cases on ChatGPT’s list existed. And, voilà! They just rolled out a new product driven by ChatGPT 4.0. I am test-driving it and started with the legal research tool, giving it the same prompt I mentioned above. It wrote something between a legal memorandum and a table of authorities, which is what I was after. That makes sense. This illustrates your point. They have clearly been on it for some time. But this validates everything you have said.

I have to admit, I am so excited it’s all I can think about. I do not seem to be boring people, either. Even those who know nothing about tech or AI hear the excitement in my voice.

Lawyers know how to craft words, right? To persuade. Right now, I am supposed to be making a lesson on LLCs for copywriters. I am reworking my presentation to say that those skilled in language and writing can be the new developers, the new innovators. Code was a barrier. Not anymore.

I do not know Christoph, but I am like him. So much is happening so fast. Also, that dude’s voice. I had a radio show for years and would LOVE to have that FM radio voice of his. The gods of broadcast must have been in a good mood the day he was born.


I see what you mean. I popped over to DeepMind and, WOW! That is what I thought AI would be. I read the AlphaCode article, which actually was more of what I expected AI to be. ChatGPT makes it look TOO easy.


Yeah. I thought MedPaLM in particular would be a great example of an AI tuned to a particular knowledge-based professional use. Last I heard, it was still behind humans in giving an accurate diagnosis - but getting close - and (the part that really impressed me) it was outperforming human doctors on the metric of NOT giving harmful misdiagnoses! That’s some seriously impressive accuracy and prediction.

I guess I should give up on lawyers. We had already let financial services take over the estate planning conversation. I had become increasingly vocal on a forum about the need for lawyers to stop being afraid of, or uninterested in, all things tech. I even approached the WealthCounsel leadership about a mastermind group to brainstorm how to leverage ChatGPT so we could catch up. They use a clunky ASP.NET version of the paper interview method. Horrible. My peeps are the ones to watch out for. If they feel threatened, look out.

As for medicine, it kinda makes sense. Hopefully, medicine is based on science, unlike law. Docs are linear compared to the chaos that law manages, or tries to.

When AI research was just starting in the eighties, I was trying to learn BASIC. And I thought I was smart. Does BASIC even exist anymore?

BASIC is kind of like Latin - nobody is actively using the language enough for it to develop, so it classes as a ‘dead’ language, but you can still write functional code in it, or run old programs that were written in it.