
[Book Review] David Sumpter, Outnumbered - Part 1

wwrww 2023. 4. 1. 23:02

https://www.amazon.com/Outnumbered-Facebook-Filter-bubbles-Algorithms-Control/dp/147294741X

 

Part 1. Algorithms that analyze us

 

Like most people, I was simply afraid of the inequality and discrimination caused by algorithms, which have permeated our daily lives and often make choices on our behalf when we are too lazy to make them ourselves. I worried about these issues but never tried to look deeply into how the algorithms work. First, I thought it was impossible to look inside them because they are part of the companies' business models. Second, probability and statistics were my weakest areas when I studied mathematics, so I still have something like PTSD about them. With that said, this book is excellent for anyone starting to figure out the underlying logic of the algorithms we encounter. The author knows which mathematical concepts are used to build these algorithms and reconstructs his own versions that resemble the commercial ones. Through this process, we can understand the underlying logic of the algorithms behind YouTube, Facebook and Google.

 

The first part covers algorithms that analyze our preferences, personality and political orientation. The author sets out to answer two main questions: "How can these algorithms analyze us?" and "How influential and accurate are they?"

 

Question #1. What's the method?

 

The primary method described in the book is Principal Component Analysis (PCA), a commonly used mathematical method for finding the most influential patterns in data with many factors. When there are more than three factors, we can no longer picture the data, because we live in a three-dimensional world: we can draw an X-Y plane or an X-Y-Z space, but anything beyond that is out of our reach. PCA takes many factors as input and looks at which of them vary together. By placing individuals in a space of hundreds of dimensions, it finds the principal components, the directions along which people differ the most, and uses them to reduce the dimension of the data. As a result, PCA throws away some information, namely the weak relationships, and concentrates on the strong ones. For example, many elements may shape our personality, such as birth order, birth date, sex, gender, job, age, nationality and sexual orientation. Suppose we want to analyze someone's personality from these data, but we are not sure which factors matter most. PCA first condenses the data into a few components. We can then match those components against, for example, MBTI (a personality type test) results, using a separate step such as regression, to see which elements contribute most to determining personality. This gives a weight for each piece of data, and with those weights we can predict someone's personality from personal information alone.
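To make the dimension-reduction idea concrete, here is a minimal sketch in plain Python. It is not the author's code or any company's algorithm. The data are synthetic (three columns driven by one hidden trait plus one unrelated column), and the names center, covariance and first_component are my own, chosen only for illustration. It finds just the first principal component, using power iteration on the covariance matrix.

```python
import math
import random


def center(rows):
    """Subtract each column's mean so the data are centered at the origin."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(rows):
    """Sample covariance matrix of rows that are already centered."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_component(cov, iterations=200):
    """Power iteration: repeatedly multiply a vector by the covariance matrix.

    The vector converges to the direction of largest variance,
    which is the first principal component.
    """
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Variance of the data along the direction v.
    variance = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
    return v, variance


# Synthetic data (an assumption for this sketch, not real user data).
random.seed(0)
rows = []
for _ in range(200):
    trait = random.gauss(0, 1)        # hidden factor behind three columns
    rows.append([
        trait + random.gauss(0, 0.3),
        trait + random.gauss(0, 0.3),
        -trait + random.gauss(0, 0.3),
        random.gauss(0, 0.3),         # column unrelated to the trait
    ])

centered = center(rows)
cov = covariance(centered)
component, variance = first_component(cov)

# Share of the total variance captured by the first component.
total_variance = sum(cov[i][i] for i in range(len(cov)))
explained = variance / total_variance

# One score per person: four numbers reduced to a single number.
scores = [sum(x * c for x, c in zip(row, component)) for row in centered]

print("weights:", [round(c, 2) for c in component])
print("explained share:", round(explained, 2))
```

With this made-up data, the first component should capture most of the variance and give the unrelated column a weight close to zero, which is the "losing weak relationships" part of PCA. In real projects one would normally use a library such as NumPy or scikit-learn instead of hand-written loops.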

 

Question #2. Is it correct?

 

This is creepy, because it feels as if computers can see right through us. Before panicking, however, we should ask: "How accurate are the results?" There are two sides to the answer. On the one hand, computers do know us well, perhaps better than people know their friends. When we assess someone's character, we already carry prejudices, so our predictions are inaccurate or biased. Computers, by contrast, can weigh hundreds of factors and are not influenced by preconceptions. As a result, computers have a chance to analyze us in new ways; they might even find aspects of a personality that the person is not aware of.

 

On the other hand, this approach has limitations. The data are limited: Facebook profile information and Likes alone are not enough for a meaningful PCA. In other words, PCA is not magic; good results come from good-quality data. For example, the algorithm becomes highly effective when we supply political orientation, a piece of data that is highly relevant to personality. This is not surprising, though: given that kind of critical information, humans can also predict others well. To sum up, the author points out that the claims made by journalists are exaggerated. Algorithms perform about as well as we do when given decent-quality data.

 

Algorithms have limitations. How they work and the results they provide are not a threat, and the first step is to understand this fact. However, a massive amount of good data will soon accumulate, and computers will develop more accurate models. What we should think about now, therefore, is how to build these algorithms in an unbiased way. The short answer is that we cannot make fully unbiased algorithms: we have to choose between fairness in procedure and fairness in results. This means that algorithms cannot stand alone without a human interpretation grounded in humanity, culture, law and feelings. Can we then say that algorithms rule us?