Writing About Data

Writing about data calls for careful word choice. Casual or tentative phrasing like “kind of” and “should” must be avoided and replaced with definitive statements, using verbs such as “indicates” and “states” instead. These words are more precise in meaning and carry fewer suggestive connotations.

To illustrate this point, I’ve compiled several graphics on the distribution of household income by county in New Mexico, and I will give examples of how to describe the data.

When describing data clusters, a statement like this would be made: “It can be observed that counties with a median household income over $55,625 form two clusters: one near the capital of New Mexico, Santa Fe, and the other along the border of Texas and New Mexico.” This sentence clearly delineates what is being studied and where. That is necessary when talking about data because it directs the reader exactly where you want them to look and nowhere else.

“For example, the counties of Eddy and Lea have household earnings averaging over $15,000 more than their neighbors in Otero, Chaves, and Roosevelt counties.” The purpose of this sentence is to describe that difference and put it into context with numbers. But the number $15,000 is useless on its own. That’s where the word “averaging” comes into play. It tells the reader that $15,000 is a typical difference across the counties being compared rather than an exact gap between any two of them, so individual counties may fall above or below it. Because the figure is calculated from the data and fits this situation, it is the linchpin of the description of the difference in household income.
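To make the idea of an “averaging” difference concrete, here is a minimal sketch of how such a figure could be computed. The income values below are hypothetical placeholders chosen for illustration, not the real figures behind the graphics, and the function name `average_gap` is my own.

```python
from statistics import mean

# Hypothetical placeholder incomes (USD), NOT real county data.
high_cluster = {"Eddy": 70_000, "Lea": 66_000}
neighbors = {"Otero": 50_000, "Chaves": 52_000, "Roosevelt": 51_000}


def average_gap(group_a, group_b):
    """Return the mean income of group_a minus the mean income of group_b."""
    return mean(group_a.values()) - mean(group_b.values())


gap = average_gap(high_cluster, neighbors)
print(f"Average gap: ${gap:,.0f}")
```

The single number that comes out summarizes five counties at once, which is why the sentence needs the word “averaging”: no individual pair of counties has to differ by exactly that amount.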

“This is likely because the counties of Eddy and Lea are close to Texan oil fields, to which New Mexican employees may commute from their homes.” Unlike the previous example, this statement does not discuss the data directly. Instead, it gives reasons for the data’s distributions and clusters. This is an example of discussing data indirectly, but in a proper way. The statement is declarative and rests on a solid assumption grounded in the counties’ physical location.

All of this shows how one could go about describing data. Direct discussion is always the most convincing way to get a message across, but indirect statements about the data are also effective. Always make sure that the phrases you use to discuss your data are declarative and leave no room for misinterpretation.
