Using ChatGPT for Your Fiction, Part 1: What is ChatGPT Really?

Filed under: Industry, Resources

April 18, 2023

These days, using AI for anything can be controversial, but even more so for creative endeavors. Is it laziness? Will it ruin your book? Is AI going to reduce human artistry to the level of cottage industry?

At the risk of spoilers, I’m not anti-use-of-AI for your writing, but it is a complicated topic that will require multiple blog posts. For now, to figure out if ChatGPT is useful or not, we first have to define what it is, what it is not, and what legal challenges there might be in using AI-generated content.

What is ChatGPT?

First, an AI like ChatGPT is not the kind of AI you might be used to from fiction. It’s a machine learning program: it was trained on a very large body of text, during which its training algorithms picked up on patterns in that text, and it now uses those patterns to generate a response to your input. ChatGPT in particular is built on a language model called a Generative Pre-trained Transformer (GPT), at launch one from the GPT-3.5 series; the model uses statistics to predict the probability that any given run of words will be followed by a particular next word, then the word after that, and so on, until it has built up sentences and whole paragraphs.
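To make "statistics predicting the next word" a little more concrete, here is a deliberately tiny sketch. It is not how GPT works internally (GPT uses a neural network trained on enormous amounts of text); it only counts which word follows which in a made-up sample sentence. The corpus and the function names are my own invention for illustration.

```python
from collections import Counter, defaultdict

# Tiny made-up training text; a real model is trained on vastly more data.
CORPUS = "the knight drew the sword . the knight rode the horse . the dragon saw the knight ."


def build_bigram_counts(text):
    """Count how often each word is followed by each other word."""
    words = text.split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def next_word_probabilities(counts, word):
    """Turn the raw follow-counts for `word` into probabilities."""
    followers = counts[word]
    total = sum(followers.values())
    return {w: n / total for w, n in followers.items()}


counts = build_bigram_counts(CORPUS)
probs = next_word_probabilities(counts, "the")
most_likely = max(probs, key=probs.get)

print(probs)        # how likely each word is to follow "the" in this corpus
print(most_likely)  # the single most probable follower
```

In this sample, "knight" follows "the" three times out of six, so it wins. Scale the same idea up by many orders of magnitude, and swap the counting for a trained neural network, and you have the rough shape of what the paragraph above describes.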

Okay, so what does that actually mean?

In essence, it’s a search engine that returns a summary of data rather than links to that data, collected into a single response that sounds natural and coherent, as close as possible to the context of your request. If your request doesn’t provide that context, it will use the highest statistical probability for the missing information; but it is still literal, and can’t provide what isn’t obviously contextual.

For example, here’s a response I got when generating suggestions about a character’s background. The program generated a lot, but when I asked, “Does Mary have any siblings?” the response was:

“I’m sorry, but I don’t have any information about whether Mary has any siblings in the prompt you provided. It’s possible that she does, but it’s also possible that she is an only child.”

It was able to produce suggestions on what gear the character used, what her home life was like, and so on, but there was no statistical certainty in this question. It was either yes or no, and I hadn’t told it which way to jump. A human would typically respond by weighing the pros and cons of making the character an only child, but the computer can’t handle that. The language model ChatGPT uses can attempt to determine intent, but it’s only a statistical model, not skill at understanding subtext. One can argue that a human understanding of subtext is itself a matter of statistics, just without the math; but if you spend enough time working with ChatGPT you’ll come to realize that something’s missing even if you’re not used to interacting with machine learning in the first place.

That statistical model is what allows ChatGPT to generate responses that sound natural and coherent, using the data it’s already been loaded with. It even enables the program to produce relatively unique arrangements of words, though it can occasionally bring to mind that immortal observation by the linguistic philosopher Stephen Fry: “Hold the newsreader’s nose squarely, waiter, or friendly milk will countermand my trousers.”

So to put it even more simply, ChatGPT isn’t artificially intelligent. It’s a gigantic digital con . . . but a useful one. It can’t come up with anything new, but it can recombine things through statistical language prediction so that you can feel like you’re talking to someone about your ideas.

That’s where it becomes useful for an author: not to write something for you or edit or even as a replacement for your own creativity, but as a means for brainstorming and researching faster than you could on your own.

How Not to Use ChatGPT

It’s important to keep its limitations in mind as you use it, though. I have asked it math and science questions and gotten wildly divergent results, even when reversing the equations to check its work. Math may be a language, but it isn’t the language GPT is optimized for, so if you’re using it for figuring out acceleration rates for a spaceship, you’re better off doing it yourself.
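If you do want to run the spaceship numbers yourself, the basic case is short enough to script. This is a minimal sketch assuming constant acceleration in a straight line and plain Newtonian physics (no relativity, no fuel mass, no gravity wells); the function names and the one-day example are mine, not from any particular library.

```python
import math

G = 9.80665  # standard gravity in m/s^2, i.e. "one g"


def velocity_after(accel, seconds):
    """Speed reached from rest under constant acceleration: v = a * t."""
    return accel * seconds


def distance_after(accel, seconds):
    """Distance covered from rest under constant acceleration: d = a * t^2 / 2."""
    return 0.5 * accel * seconds ** 2


def flip_and_burn_time(distance, accel):
    """Total trip time when you accelerate for the first half of the
    distance and decelerate for the second half: t = 2 * sqrt(d / a)."""
    return 2 * math.sqrt(distance / accel)


one_day = 24 * 60 * 60  # seconds
speed = velocity_after(G, one_day)
travelled = distance_after(G, one_day)

print(f"{speed:,.0f} m/s after one day at 1 g")
print(f"{travelled:.3e} m covered in that day")
```

Because the formulas are fixed, you get the same answer every time you run it, which is exactly what you can't count on from a language model.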

It also can’t answer anything it’s not loaded with, which includes a lot of things in the last two years. The majority of its database cuts off at September 2021. If you ask it about something it’s not loaded with, say a book that was only published this year, it will respond with the most statistically likely subject it is loaded with, and if you say it’s in error, it will move to the next one until it dips below a probability threshold. At that point, it will either say it has no information on the topic, or it will say the topic does not exist — which, since all it knows is its database, is very true from its perspective.

Its language model will allow it to build on previous elements of any individual conversation, and you can have multiple parallel conversations at once; but information is not shared between those conversations, and it does not automatically add anything you input to its own language database. This is only done manually by the OpenAI programmers, which means that anything you input may be seen by human eyes, so treat it the same way you do Facebook or Twitter in that regard.
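One way to picture the separation between conversations is as a set of independent lists. The sketch below is purely a mental model; the `Conversation` class is hypothetical and is not OpenAI's actual code or API.

```python
class Conversation:
    """Hypothetical stand-in for one chat thread; not OpenAI's actual code."""

    def __init__(self):
        # Every new conversation starts with an empty history.
        self.messages = []

    def say(self, text):
        """Record a message and return everything visible in this thread."""
        self.messages.append(text)
        return list(self.messages)


fantasy = Conversation()
scifi = Conversation()

fantasy.say("Mary is a ranger who carries a longbow.")
visible_to_fantasy = fantasy.say("What gear does Mary use?")
visible_to_scifi = scifi.say("What gear does Mary use?")

print(visible_to_fantasy)  # includes the earlier line about Mary
print(visible_to_scifi)    # only the question; this thread never heard of Mary
```

The second thread has nothing to go on because the first thread's history was never shared with it.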

Will ChatGPT steal my stuff? Will it learn to be an author from my questions?

According to OpenAI, the company behind the software, input you give to ChatGPT may be used to improve it. That’s why I tell people not to input anything specific to their IP that hasn’t been published yet. However, that’s the same kind of rule you should use for posting on social media; even under US law, where what you create is protected under copyright even if it’s not registered or published, it can be work to prove provenance, and your best protection is a good paper trail. An email is much easier to track than a social media comment.

There are some factors that prevent your data from being used, though. As I said, the ChatGPT programmers review the information before it’s added to the general database. The program always starts each new conversation from zero, no matter how many conversations you’ve had with it. Human eyes are put on anything that gets added because they want to ensure that no personal information enters the database, and to keep biased, incorrect, and low-quality information from being used to build responses. This of course means that the biases of the programmers are what get programmed; the computer is incapable of being unbiased, because it is only what it is programmed to be.

(I will say, though, that I’ve done some tests and over time ChatGPT has become increasingly neutral in its analyses of anything related to real life, real-world beliefs, public figures, etc. Due to its statistical model, biases inherent in your own question, such as “Explain to me why Matthew Bowman sucks,” might result in a “Some people say Matthew Bowman sucks because . . .” kind of response, but can also result in “Matthew Bowman sucks because . . .” or “My programming does not allow me to disparage Matthew Bowman.” It largely depends on how you phrase it, and you can find plenty of examples online of how people have found loopholes in ChatGPT’s programming, some of which have already been closed by the programmers.)

Even when it’s inputted by a programmer, though, OpenAI says the information is “generalized.” That is, the language model is designed not to provide specific responses to specific input; each time you ask a question in a different conversation, you’re supposed to get a range of responses. (Which is probably the specific reason why complex math fails, but that’s a guess on my part.) It seems as though the system is designed to avoid using specific blocks of text you input as a whole package to another user, due to the very nature of the GPT process rather than any promise on the part of OpenAI.
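A common way for a language model to produce a range of responses is to pick each next word at random, weighted by its probability, instead of always taking the single most likely one. The sketch below illustrates that idea only; the odds table is invented, and I'm assuming (not quoting OpenAI) that something along these lines is behind the variation you see.

```python
import random

# Invented odds for the word that follows "Mary drew her ..."
NEXT_WORD_ODDS = {"sword": 0.5, "bow": 0.3, "staff": 0.2}


def pick_greedy(odds):
    """Always take the single most probable word."""
    return max(odds, key=odds.get)


def pick_sampled(odds, rng):
    """Pick a word at random, weighted by its probability."""
    words = list(odds)
    weights = list(odds.values())
    return rng.choices(words, weights=weights, k=1)[0]


rng = random.Random(42)  # fixed seed so repeated runs behave the same
greedy_picks = [pick_greedy(NEXT_WORD_ODDS) for _ in range(200)]
sampled_picks = [pick_sampled(NEXT_WORD_ODDS, rng) for _ in range(200)]

print(set(greedy_picks))   # only ever one answer
print(set(sampled_picks))  # a mix of answers
```

Weighted picking is great for varied prose and terrible for arithmetic, where only one answer is right, which fits my guess about why complex math fails.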

Regardless of the risk of a programmer noticing you fed parts of your manuscript to ChatGPT and deciding to make it a permanent part of the database, or the likelihood that it would regurgitate it to another user, the easiest way to prevent it is not to do it. Don’t put in personal information or intrinsic and unpublished elements of your own IP. There are plenty of ways to use ChatGPT without exposing yourself, and we’ll get to those in future blog posts.

Legalities of Using ChatGPT

That all said, there’s still a chance that someone else’s work could come out of a response to your request. There’s also the issue that copyright protection for AI-generated content is, speaking generally, a goram mess. So here are some things to keep in mind when using ChatGPT to assist your creative endeavors.

AI may be banned by a publisher or other entity. Some publishers, contests, awards committees, etc., are discussing or outright requiring a certification on your part that AI was not used for any part of your work. This is not likely to be sustainable, applies to only a tiny minority, and has no bearing on most authors in the first place, but it’s worth mentioning. In the end, AI is terrible at being creative, so my personal prediction is that any creative work that substantially includes raw AI output isn’t going to win a contest anyway. The generalization I just talked about prevents it from understanding pacing, subtext, or drama. I’ll provide examples in future posts, but suffice to say I don’t fear AI forcing me to retire as an editor.

Substantial content might not be protected under copyright. In the US, purely AI-generated content cannot currently be placed under copyright. There are legal challenges pending, and it’ll be interesting to see where that goes for multiple reasons I might talk about some day as we see more of it progress. Meanwhile, in the UK, computer-generated content can be protected, but the author is taken to be whoever made the arrangements necessary to create it, and it isn’t settled whether that means you or the owner of the program. Be aware of your country’s laws on fair use and transformative content. In the US you’re pretty much okay using ChatGPT as a brainstorming tool, but I don’t know about the UK. Sorry, Brits. Oh, and you blokes in Northern Ireland who aren’t actually part of Britain.

AI cannot understand context, nor is it an authoritative research source. I know someone who is a news editor for a popular website, and she just had to fire one of her journalists for using ChatGPT for research. It wasn’t specifically because of ChatGPT itself, but because the information the journalist was using wasn’t checked and was incredibly wrong. It’s like a search engine that can summarize and talk to you, but that also makes it worse than Wikipedia when it comes to being an accurate source of information. Don’t depend on it to understand the nuances of being human, to handle complex math for you, or to be at all a replacement for your own research. It can be an excellent tool to speed up your writing, but only if you know how to use the information in the first place.

Content may contain accidental plagiarism. While the GPT model works very hard to generalize, as I described already, things you get might be substantially similar to something someone else already wrote. This is a remote kind of danger, because if you give a room of ten authors the same writing prompt, you’ll get fifty different stories. (If you don’t believe me, try it yourself.) Even outright fanfic can diverge a lot based solely on individual style. However, it’s good to keep in mind that you shouldn’t depend on a program for major story elements; the best way to use it is for filling in incidental details, not whole swaths of a novel.

I always tell students that the first step in any new discussion is to define your terms. Now that we’ve looked at the basics of what ChatGPT is, what it’s not, and what it can’t do, I’ll go through how to use it to help your writing next time. After that, we’ll do some in-depth examples using both fantasy and science fiction settings.

But before then, I’m going to do a fiction review with a bit of a twist . . . because it’s not a novel, short story, movie, TV show, or audio drama. It’s a video game. See you next time!

Tags: Artificial Intelligence, ChatGPT, Copyright, Language, Machine Learning, OpenAI

Novel Ninja