Well, the unsurprising answer, as with most philosophical questions, is that it depends. Not only could we have a long and detailed conversation about what it means to be evil, but we could cover every topic imaginable, every detail, and still somehow come to different conclusions (as is the way with human nature). To avoid going down what might be an endless rabbit hole of defining evil, let's take a more practical approach.
Assume we all agreed that there was an action that was evil by definition and could only ever be considered evil: what would it mean if automation chose to perform that action? From a purely materialistic or logical standpoint, it wouldn't actually mean anything. Not only would we be unable to determine whether the automation was evil, but we also wouldn't be able to tell if it were good. The same goes for every action automation could take. In the current state of computer science, there is no evidence that automation can care either way. Even the most complex models that exist make choices to achieve the desired outcomes of the system they serve. Some of those outcomes may be perceived as evil, which would lead the automation to make decisions that appear evil; however, there is a clear disconnect between the decision the automation makes and any understanding, on the automation's part, of what that decision even means.
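To make that disconnect concrete, here is a minimal, purely illustrative sketch (the function and action names are my own, not drawn from any real system) of how such a choice is typically made: the system simply picks whichever option scores highest against its objective, and nothing in the loop represents what the option means.

```python
# Hypothetical sketch: an automated "decision" is just objective maximisation.
def choose_action(actions, objective):
    """Return the action with the highest objective score."""
    best_action, best_score = None, float("-inf")
    for action in actions:
        score = objective(action)  # a number; no notion of good or evil
        if score > best_score:
            best_action, best_score = action, score
    return best_action

# The "evil" label exists only in our interpretation, not in the system.
actions = ["help_user", "mislead_user"]
scores = {"help_user": 0.4, "mislead_user": 0.9}
print(choose_action(actions, scores.get))  # prints "mislead_user"
```

If the objective happens to reward a harmful outcome, the same indifferent loop produces a harmful choice; the loop itself has not changed at all.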
One of the underlying reasons is that automation itself does not have many of the human characteristics that fiction writers often portray it as having. A prime example is emotion. You have likely heard of personification, the act of giving inanimate objects humanlike characteristics. Personification of automation has become so widespread and ingrained in our minds that much of the general population has come to truly believe that automation could have human characteristics that it has never been shown to possess (not even slightly) and likely never will be able to.
You may remember some of the more famous examples of pop-culture automation, such as HAL 9000 from 2001: A Space Odyssey, any of the Terminators from The Terminator franchise, or the Decepticons from Transformers. What all of these share is that they are highly advanced computer-based systems intelligent enough to show characteristics of human behaviour. However, they also share the commonality that none of them address how a computer could ever come to experience that level of intelligence or emotion. Thankfully, because of the nature of fictional stories, this concern is something we as viewers can easily suspend in favour of enjoying the story, but allowing ourselves to believe that something may be true does not make it possible to begin with.
To better emphasise what I mean, let's take trust as an example. We are all familiar with what it means to trust someone, what it feels like to have that trust broken or to rebuild it, and what it is like to have someone trust you. But if you had to, how would you express the feeling of trusting someone? There are many definitions you could offer for what it means to trust someone, but are you really able to capture the essence and feeling of trust in just a few words? Let's take it a step further. Even if you could perfectly capture, in words, what it means to trust, if you had to explain it to someone (or something) that had never experienced the feeling of trust, do you really think they would know what it means to trust? At most, you could probably get them to understand trust on a conceptual level, but you wouldn't be able to show them what it actually feels like to trust. The result is that we could have a being that understands trust and could even exhibit the outward characteristics of trusting others, but that in itself is not trust.
So, what's the difference? Why can't automation ever understand what trust is? The likely cause is rooted in biology. The unfortunate reality is that, as far as we have come with computer technology, it all still relies on the same fundamental principles. Although we have built machines that can do more and process more, when we look at the lowest level of what it means to be a computer, it is just a series of ones and zeros stored in a particular arrangement. Despite all of our technological accomplishments, computers remain highly primitive compared to biology. One of the most distinctive differences is the emotions that humans experience. Although we can characterize emotions and show which molecules cause which physiological changes that lead us to feel happy or sad or angry, there is no computational correlate that captures emotion. We could come up with clever ways to emulate it; for example, we could keep an anger score and have the computer make harsher or gentler decisions depending on that value, but at the core of the system the automation still would not understand what it means to be angry. Despite exhibiting outward characteristics that we could personify as human emotion, the automation itself is not angry.
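To show just how shallow that kind of emulation is, here is a toy sketch of the anger-score idea described above (the class, thresholds, and behaviour are all invented for illustration): the replies get harsher as the score climbs, but the score is nothing more than a number being compared against thresholds.

```python
# Toy "anger score": outwardly harsher behaviour, but no feeling anywhere.
class Responder:
    def __init__(self):
        self.anger = 0.0  # 0.0 = calm, 1.0 = maximally "angry"

    def register_insult(self):
        # Each insult nudges the score up, capped at 1.0.
        self.anger = min(1.0, self.anger + 0.25)

    def reply(self, message: str) -> str:
        if self.anger > 0.75:
            return message.upper() + "!!"  # "harsh" reply
        if self.anger > 0.25:
            return message + "."           # curt reply
        return message + " :)"             # "friendly" reply

bot = Responder()
print(bot.reply("hello"))      # hello :)
for _ in range(4):
    bot.register_insult()
print(bot.reply("hello"))      # HELLO!!
```

Everything that looks like anger here lives in the branching logic we wrote; strip out the comments and the word "anger" and the system behaves identically.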
This isn't to say that automation will always be this way. Given the leaps and bounds we've seen in technological development, it is possible that someone will create an entirely new means of computing, one built on the backbone of a system complex enough to 'create' emotions. For the time being, though, our primitive technology is not, and likely never will be, capable of understanding what we feel, and thus does not have the capacity to be evil. Only the designers of those systems are capable of giving automation malicious intent. Evil robots start with us and the goals we give our systems.