Well, LLMs are still a work in progress, but the time is coming when their performance will outstrip humans'. The greater the economic incentives, the sooner that time will come. "What are people for?" is a question society should start looking at intensely. Right now.
Absolutely. I read the book “Futureproof” by tech journalist Kevin Roose a few years ago and strongly recommend it: https://www.kevinroose.com/futureproof
Excellent article. I am interested in the effects of AI on our profession. By the way, I am sharing your article with my readers.
Thanks, Jose! I'm glad you enjoyed it. It's an important topic, and I'm glad to see some research starting to come out in this area.
Looks like a Gentleman’s C to me…
It's certainly not terrible, especially GPT-4's performance. Behind the topline numbers, the inconsistency in GPT-3.5 seems the most concerning to me: decent average performance, but it got only 33% of the Qs right across all three attempts, implying that the free model, at least, doesn't consistently get the same question right when asked repeatedly...
Over time, these models will unquestionably continue to improve. GPT-5 and its equivalents will surely be more competitive with vet students', or even licensed vets', performance. I do wonder if it will ever be possible to stamp out the hallucination issue; if not, we will always need to exercise some diligence with these tools and cross-check them for high-stakes applications.
Thanks for sharing these, Eric! It'd be interesting to see how ChatGPT and other AI tools fit (or don't) into our respective disciplines. Personally, as long as they're used for tasks humans no longer want to perform, and don't take away artistic and scientific autonomy, I can see them having lasting positive effects.