There is much talk about how artificial intelligence (AI) can write for us.

Nikki Gemmell wrote in The Australian newspaper that ‘We scribblers and hacks are staring at the abyss in terms of the chatbot future roaring at us’.

Professional copywriter Leanne Shelton lamented its impact on her business, which she expects to take a 35 percent hit this year following OpenAI’s release of ChatGPT last November.

I am seeing clients experiment with a range of AI tools to help with their work too.

Yet, like Nikki Gemmell, I am not concerned about AI taking my job.

AI can help the writing process and will stretch us to think harder and better, but it is not (yet, at least) a match for human insight.

Let me explain why.

  1. AI can’t make a judgement call
  2. AI relies on humans asking really good questions
  3. AI can’t explain how it arrived at its answer
  4. AI’s writing ability is surprisingly poor
  5. AI is inherently biased

Let me unpack each of these further.


AI can’t make a judgement call

Even when organisations (eventually) set up their own private AI instances, AI can only offer limited help. This is true even after proprietary data is fed in and appropriate access permissions are set up.

Let’s imagine that we feed the past decade’s board and senior leadership team papers into a proprietary database and then add an AI engine on top. Leaders and board members could enter queries such as: ‘What is our company’s data security strategy?’ The AI engine would then ‘read’ all of the material and summarise what the papers say about the company’s data security strategy. That is useful as far as it goes.

But what if we asked it: ‘How could we improve our data security strategy?’ Again, it would summarise what the papers in its database say about the potential risks inherent in our current strategies. Again, useful as far as it goes.

Assuming the information in the papers is both accurate and complete, the summary may be helpful. What I don’t know is whether it would place the strategy at a point in time or give all the information equal weighting. For example, a five-year-old data security strategy would be out of date. Would the tool qualify information from that strategy as being five years old, or merge it with every other data security item and give them all equal weight? I am not sure, but for this kind of information to be useful we would need to know.
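To make this concern concrete, here is a minimal sketch of how a retrieval layer over such a database could down-weight older papers instead of treating everything equally. Everything in it is invented for illustration (the sample papers, the retrieve function and the half_life_years setting), and it shows one possible design, not how any actual AI product works.

```python
from datetime import date
from math import exp

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy stand-in for a decade of board papers (all content invented).
papers = [
    {"title": "Data security strategy", "year": 2018,
     "text": "Our data security strategy relies on perimeter firewalls and annual audits."},
    {"title": "Cyber risk update", "year": 2022,
     "text": "The board endorsed multi-factor authentication and encryption of customer data."},
    {"title": "Technology roadmap", "year": 2023,
     "text": "Zero-trust architecture and continuous monitoring are data security priorities."},
]

def retrieve(query, papers, half_life_years=3.0):
    """Rank papers by relevance to the query, discounted by how old they are."""
    vectoriser = TfidfVectorizer()
    doc_matrix = vectoriser.fit_transform([p["text"] for p in papers])
    query_vec = vectoriser.transform([query])
    relevance = cosine_similarity(query_vec, doc_matrix).ravel()

    this_year = date.today().year
    scored = []
    for rel, paper in zip(relevance, papers):
        age = this_year - paper["year"]
        recency = exp(-age / half_life_years)  # older papers count for progressively less
        scored.append((rel * recency, paper))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

for score, paper in retrieve("What is our data security strategy?", papers):
    print(f"{score:.3f}  {paper['year']}  {paper['title']}")
```

The point is that a human has to choose whether and how recency matters; whether a given AI product does anything like this is exactly what we would need to know.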

The limitations become even more obvious when we ask the question that we really need an answer to. What would it say if we asked it: ‘What is the right data security strategy for our company in today’s context?’

This is where the human comes in. Deciding what the ‘right’ strategy is for a specific company relies on judgement. So far at least, AI doesn’t have the ability to make a judgement call.


AI relies on humans asking really good questions

AI can only answer the questions we ask using the data it has access to. If we ask the wrong question, we will get the wrong answer.

In my experience, asking the right question is a major part of the challenge. 

So even accounting for all of our limitations, humans are at an advantage here. We can interpret the questions we are asked, which can be very useful.

If I ask my team to answer a specific question and they realise I am off base, they can answer the question I asked while also providing what I really need.

They can do this because they understand the context in which I operate, which an AI tool does not.


AI can’t explain how it arrived at its answer

While it is fun to ask these bots all sorts of questions to see how they answer, they can’t explain their reasoning. This matters if, for example, we need to audit something.

Imagine reporting to a regulator that customer complaints for a product such as a credit card fell by 20 percent during 2023. The regulator would ask you to provide evidence so it can be confident that this is true.

In the current world you can unpack the data feed. You can explain where and when the data was collected, and how it fed into the dashboard that generated the result.

AI doesn’t allow you to do this; it simply asserts what it found using its own hidden processes.
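To make the contrast concrete, here is a hedged sketch of what a traceable calculation looks like. The figures, product and source names are invented; the point is that every number carries its source and extraction date, so an auditor can walk the chain backwards.

```python
from dataclasses import dataclass

@dataclass
class ComplaintCount:
    period: str        # reporting period, e.g. "2023"
    source: str        # where the figure came from, so an auditor can trace it
    extracted_on: str  # when it was pulled from the source system
    count: int

# Invented figures for a credit-card product, purely for illustration.
records = [
    ComplaintCount("2022", "CRM export, credit-card complaints", "2024-01-05", 1500),
    ComplaintCount("2023", "CRM export, credit-card complaints", "2024-01-05", 1200),
]

baseline = next(r for r in records if r.period == "2022")
latest = next(r for r in records if r.period == "2023")
change = (latest.count - baseline.count) / baseline.count * 100

print(f"Complaints went from {baseline.count} to {latest.count}: {change:+.1f}%")
print(f"Sources: {baseline.source} ({baseline.extracted_on}); "
      f"{latest.source} ({latest.extracted_on})")
```

A chatbot’s summarised answer gives you the final line without the lineage behind it.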


AI’s writing ability is surprisingly poor

I put this to the test recently in a conversation with a client. Brooke had been playing with ChatGPT to see if it could help her write a risk memo on non-lending risk acceptance in digital processes.

The result was both unhelpful and hard to read. It identified that operational, cyber and compliance risks needed to be considered. While this was true, Brooke already knew it, and the output lacked context.

As a test, we put the response through my favourite writing tool, the Hemingway Editor. This involved copy-pasting the text from ChatGPT into Hemingway, which then evaluated the writing quality.

It assessed the quality as poor and gave a reading level of Grade 14. That means the memo was written at university level. It classified 13 of the 20 sentences as very hard to read.

You might not think this is a problem, given many people reading risk reports are university graduates. It is, however, well above the Grade 8 I recommend to my clients to ensure fast and easy reading for busy executives. In contrast, this article scores at Grade 7.

We then asked it to improve the language of its original draft and re-tested with Hemingway. The new draft came in at a reading level of Grade 9, a significant improvement if we ignore that the content was still unhelpful.

I have repeated the test and had similar results.
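If you want to run a similar check yourself, here is a rough sketch. I can’t reproduce Hemingway’s own scoring here, so it uses the standard Flesch-Kincaid grade formula via the textstat Python package as a stand-in, and the sample sentence is invented in the style of the ChatGPT output rather than taken from Brooke’s memo.

```python
import textstat  # pip install textstat

# Invented sample in the style of the ChatGPT output, not Brooke's actual draft.
draft = (
    "Non-lending risk acceptance in digital processes necessitates the "
    "comprehensive consideration of operational, cyber and compliance risk "
    "in accordance with the organisation's established risk appetite framework."
)

grade = textstat.flesch_kincaid_grade(draft)
print(f"Flesch-Kincaid grade: {grade:.1f}")
if grade > 8:
    print("Above the Grade 8 target I recommend for busy executives.")
```

The exact number will differ from Hemingway’s, but the idea is the same: long, dense sentences push the grade up.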


AI is inherently biased

This is where the discussion gets really interesting. I have asked ChatGPT and Google’s equivalent, Bard, to provide me with information about topics that interest me.

I find it is useful when asking for facts: for example, which podcasts discuss board paper writing, or which art schools offer weekend life drawing classes in my city. The tool provides a tidy summary, which is easier than hunting through the links provided by Google or Bing.

I worry, however, about its responses that include opinion. I had some fun and asked some personal questions to see what it would do.

For example: ‘How does the moon affect women’s health?’ ChatGPT claimed the moon doesn’t affect women’s health. In contrast, Bard described this as a contested area and offered a list of aspects currently being researched. In this instance, Bard’s answer was more accurate and more helpful.

When I asked about sensitive topics, however, the answers were both contradictory and troubling. Both Bard and ChatGPT have strong views about topics such as climate change and the move to electric vehicles, among other things.

Both began by explaining that they were AI tools that could not offer opinions, before doing just that.

Given AI is a tool coded by humans, those humans influence how it works and the results it gives. We need to be very aware of this and evaluate any results we receive accordingly.

My conclusion is that although AI is a fun tool to play with and can be useful for finding information, it needs to be used with care. It won’t replace human judgement any time soon. It will, however, push us to get better. We need to critically evaluate anything it ‘spits out’ and lift our own game so we are adding real value, not just regurgitating facts.

I hope that helps.

Cheers, 

Davina