Reflections on the written word: JOURNALISM

Nov 24, 2014

The power of good visualizations

The power of good visualizations: that's the topic of this week. Why are visualizations powerful? How can I make my visualization powerful? Those questions and others will be answered in this blog post.

Last week I talked about how to find data and the importance of a good dataset. This week I’m going to talk about how you can make that dataset come to life. Visualizations are a great way to make your story more comprehensible and to make it easier for readers to extract meaning from your dataset.

Why should you use visualizations?

Alberto Cairo is professor of the professional practice at the School of Communication of the University of Miami, author of the book ‘The Functional Art: An Introduction to Information Graphics and Visualization’ and an instructor of the free online data-driven journalism course. He describes a visualization as a graphical representation of evidence. He explains that we use graphs and maps because in many cases they are the only way in which we are able to extract meaning. A spreadsheet is just a set of numbers, where you can only see the actual figures. But if you transform that set of numbers into a graph or another visual, your brain can extract meaning and patterns from the data. To show how powerful visualizations can be, Cairo used this map as an example:

This map represents the elections in Ukraine. The colors on the map show which political party got the most votes in each district in the parliamentary elections of 2012. The blue circles mark the districts won by the Party of Regions, the party of the current president. The orange circles mark the districts won by the Fatherland Party, the pro-Western opposition party. The size of each circle represents the margin in votes by which the winning party took that district. Even if you don’t speak Ukrainian, you are able to extract the main meaning of this map. You can see that Ukraine is divided: the western part of the country votes more for the opposition party and the eastern part votes more for the party of the current president. If you just looked at the dataset, you would not be able to see this pattern, and so you would not be able to extract the meaning of the data. That is exactly why visualizations are so powerful.

Importance of good visualizations and things to avoid

Now that you know visualizations are powerful and can make your story more comprehensible, it is important to be critical about which visualizations you use. Your visualization should be clear, concise and not too complicated, and it should help your readers see the differences you want to show. A good visualization can be a great addition to your story. It helps bring your story to life, makes clear what you are talking about and can make your story more interesting. However, if you use a visualization the wrong way, it can affect your readers’ interest in your story and your credibility as a writer. Bad visualizations might help you attract readers, but it is not smart to use them. If your readers find out that your display of the data is incorrect, they will distrust everything you say in your story and might not even want to read further. Mistakes can happen, so if you accidentally use a bad visualization once, it probably won’t matter much. However, if you use them more often, it will affect your credibility. But what makes a visualization bad? And what should you avoid? To answer these questions, I will show you some examples of bad visualizations and how to improve them.

The first example of a bad visualization is quite obvious. As you can see, the height of the bars in the first visualization doesn't match the numbers that are displayed. Instead of using this bad graph, you should use the second graph. You can see that in this revised graph the height of the bars matches the numbers.

In the second example you can see that in the first graph, the numbers on the vertical axis don’t start at zero. There are indeed some differences between those seven months, but the differences are not as big as they seem in this graph. If the y-axis had started at zero, you would have been able to see that more clearly. Instead of using the first graph, you should use the second graph, where you can see the actual differences. However, it is not always bad to start the vertical axis somewhere above zero. You may have noticed that the vertical axis in the first example doesn’t start at zero either. In that case, however, the differences in the heights of the bars represent a large difference in the amount of money that the government has spent. If the axis had started at zero, you would have believed the differences weren’t as big. So keep in mind what your graph is representing and what the most adequate way of displaying your data is.
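To get a feeling for how much a raised axis can exaggerate a difference, here is a minimal sketch in Python. The two monthly values and the axis starting point are made up for illustration and are not taken from the example graphs, and `drawn_height` and `exaggeration` are hypothetical helper names.

```python
def drawn_height(value, baseline=0.0):
    """Height of a bar as drawn when the vertical axis starts at `baseline`."""
    return value - baseline


def exaggeration(low, high, baseline):
    """How many times larger the drawn ratio of two bars is than the real ratio."""
    real_ratio = high / low
    drawn_ratio = drawn_height(high, baseline) / drawn_height(low, baseline)
    return drawn_ratio / real_ratio


# Hypothetical values for two months (assumed for this sketch).
low, high = 100.0, 110.0

# Axis starts at zero: the bars differ by the real 10 percent.
print(exaggeration(low, high, baseline=0.0))  # 1.0

# Axis starts at 90: the taller bar is drawn twice as high as the shorter one,
# although the real value is only 10 percent higher.
print(exaggeration(low, high, baseline=90.0))
```

A result close to 1 means the picture is faithful to the numbers. Whether a larger value is misleading or justified still depends on what the graph is meant to show, as discussed above.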

In the last example you can see that the visualization on the left provides you with a lot of information. The first map in the first visualization shows you the unemployment rate in the Netherlands from January 2008 until January 2013. The second map shows you the rate from January 2012 until January 2013. This visualization is supposed to show you how the unemployment rate has changed over the last five years in comparison to the last full year. However, in this display of the data it is very hard to extract the real meaning, because it is difficult to see the differences between the two maps. Instead of trying to display all this data at once, you should use a map that is more concise. In this case, it would be better to show just the second map, as in the second visualization. If you would still like to show the development over the past five years, it would be wiser to use a different kind of visualization.

What are the key elements to a good visualization?

So what makes a visualization good? Cairo states that there are four features that define a good visualization. Your visualization should be functional, beautiful, insightful and enlightening. The shape of the graphic should match the questions that the visualization answers. The visualizations should also be attractive and insightful, so that readers will be eager to read your story. Besides that, the information displayed in the visualization should also shape the perception of the reader. According to Cairo, the three rules that you need to keep in mind in order to portray these features are as follows:

Think about your audience and the publication.
Think about the questions your visualization should answer.
You should be able to understand the visual without reading every number.

In my opinion, and as you have seen in the examples above, there are some other important and more concrete factors to pay attention to. You need to make sure that the numbers you show match what you display in your graph. Moreover, pay attention to the y-axis and the x-axis and try to keep your visualization clear and concise. Don’t try to display too much information at once. Also keep in mind what you want your visualization to represent. As Cairo stated, it is important to think about the questions your visualization answers. That can mean that in some cases it is more logical to break a few rules in order to display your data in a more adequate way.

Conclusion

Visualizations can be a powerful and quite useful addition to your story, provided that they are used correctly. You have read above why it is important to use good visualizations. Furthermore, the examples above and Cairo's rules have shown you what you need to think about before you create or use a certain visualization. Think about what you want to show and what message you are trying to deliver. And as always, be critical!

Nov 20, 2014

How to find the data you’re looking for

One particular field of journalism is data journalism. Simon Rogers, Data Editor at Twitter, former editor of the Guardian’s award-winning Datablog and an instructor of the free online data-driven journalism course, describes data journalism as a way of telling stories by using numbers. It brings stories that are in the public eye to life by showing the numbers behind the news. The data can be accompanied by visualizations, but they are only there in service of the story.

For as long as journalism has existed, the reporting of data has played a role as well. In the past, data was often collected with a notebook and a cassette recorder, and journalists often had to rely solely on the research and analysis performed by statisticians. Over the years the techniques of data journalism have changed. Journalists now have much easier access to tools that help them gather data, such as Excel and Numbers, and to tools that help them visualize their data. In the digital age we now live in, open data has also become more widespread. Governments and other organizations that collect statistics around the world are publishing thousands of databases online, which has made it both easier and harder for journalists to find the data they are looking for. The search for data has become easier, because journalists can now browse the Internet and search for the information they need. However, since so many datasets are now available to journalists and the public in general, it is also more difficult to find the ‘perfect’ dataset. What I mean by the ‘perfect’ dataset is a dataset that not only offers you the data you’re looking for to accompany your story, but that is also valid. This blog post will offer you, as journalists, guidelines on how to find this ‘perfect’ dataset yourselves.

How can you get data to support your story?

Quote: “Data journalism begins in one of two ways: either you have a question that needs data, or a dataset that needs questioning. Whichever it is, the compilation of data is what defines it as an act of data journalism”. – Paul Bradshaw

Paul Bradshaw is the Head of the Online Journalism MA at Birmingham City University, Visiting Professor at City University’s School of Journalism in London and also an instructor for the online data journalism course. His quote shows that a story can be based either on a question for which you need to find data or on a dataset that raises an interesting question that needs to be answered. This blog post focuses on the first situation: you have a certain topic in mind and are looking for data to accompany your story. The first thing to do is ask yourself ‘What kind of data am I looking for?’ When you know what you are looking for, you can start searching for the data.

Where can you find data?
Of course you can collect your data by doing your own research, but in most cases you will probably not have the time or money to do that. Therefore, a quicker way to gather data would be to look for it online. As I have mentioned before, more organizations are publishing their data online. You can, for instance, go to a government website or the website of a national statistical service and find all sorts of data there. On this Wikipedia page you can find a list of national and international statistical services you could use to gather data. You can also look for information on the websites of international bodies, e.g. the website of the World Health Organization, the United Nations, the World Bank or the European Union.

How do you know if your dataset is valid?
When you have found a dataset, you need to make sure that the data is valid and does indeed support your story. So how do you know if your data is trustworthy? Rogers states that when you are relying on data that is collected by someone else, you need to check who collected it and when and how it was collected. Get in touch with the person who collected the data and ask them about it. Besides that, try to find another source that has the same kind of data and compare that dataset with the one you found. These two steps are very important to determine whether your data is valid or not.

Take for instance this example described by TechTarget, which shows how the analysis in big data projects can go wrong. In this project researchers wanted to use Twitter feeds and other social media to predict the unemployment rate in the United States. They looked for words that pertained to unemployment, e.g. jobs, unemployment and classifieds, in tweets and posts on other social media. After that they looked for correlations between the number of words per month in this category and the unemployment rate of that month. During the project there was a sudden increase in the word count, so the researchers believed they were on to something. However, what they failed to notice was that Steve Jobs died in the same period in which they found the increase. The number of tweets with ‘jobs’ in them was therefore of course higher, but this was not related to unemployment. If the researchers had looked more closely at what was happening during the time of their research, they would have known that the increase in words was unrelated to the unemployment rate. So it is important for you, as a journalist, to be aware that not all research is accurate and trustworthy. If you find another dataset that says the same thing, the chances that you have found a good, trustworthy dataset are higher. Furthermore, you need to be aware of how you interpret the data. Most mistakes in data analysis come from interpreting the data wrongly. Look carefully at what the data is actually saying and not just at what you want or believe it is saying.
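The Steve Jobs mix-up is easy to reproduce in a few lines of Python. The sketch below is only an illustration: the example posts and the function name `count_keyword_posts` are made up, and this is not the researchers' actual method or data. Only the three keywords come from the example above.

```python
import re

# Keywords from the example above that were taken to signal unemployment.
KEYWORDS = {"jobs", "unemployment", "classifieds"}


def count_keyword_posts(posts):
    """Count posts that contain at least one keyword, ignoring case."""
    count = 0
    for post in posts:
        # Split the post into lowercase words so "Jobs" and "jobs" look the same.
        words = set(re.findall(r"[a-z]+", post.lower()))
        if words & KEYWORDS:
            count += 1
    return count


# Hypothetical posts, invented for this sketch.
posts = [
    "Checking the classifieds again, still no luck",
    "Unemployment office was packed this morning",
    "RIP Steve Jobs, thank you for everything",
    "Steve Jobs changed how we use phones",
]

# All four posts match, although only the first two are about unemployment.
print(count_keyword_posts(posts))  # 4
```

A plain word count cannot tell the person from the noun, which is why it pays to read a sample of the matching posts before trusting a sudden increase.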

The five W's: who, what, where, when and why

Keep the five W’s in mind
The most important things to remember when you are trying to see if your dataset is valid are the five W’s, as described by Simon Rogers. Ask yourself these questions before you use the dataset you found.

Who: Where did the data come from?
What: What are you trying to say with your data?
When: How old is your data?
Where: Which situation is described by the collected data? An essential part of data journalism is to combine different datasets and create a new story. Simon Rogers has, for instance, combined data on gun ownership and homicides around the world into one supporting visual.
Why: Why is the data you found interesting and what does the data mean?

Conclusion
In conclusion, the ‘perfect’ dataset offers you the data you are looking for to accompany your story and is also valid. This blog post has shown you how to find this dataset and how to determine whether it is valid. To summarize, you need to check who collected the data you found and when and how it was collected. Get in touch with that person and ask them questions about their data. When you have found a dataset that could support your story, be aware that not all data is accurate and trustworthy. Try to look for another source with the same kind of data. The chances that your dataset is trustworthy are higher when you have another source that says the same thing. If you want your data to be valid, always keep the five W’s in mind. They offer you guidelines that can help determine whether the data can be trusted or not.
