Sunday, December 14, 2014

Analyzing Qualitative Data

The Headline Method of Analyzing Qualitative Data
R. A. McWilliam
Siskin Center for Child and Family Research
December 2014
Special-education and psychology researchers still often emerge from their doctoral programs with little to no training in qualitative research. This document is a guide for analyzing qualitative data, meaning data consisting primarily of words as opposed to numbers. I developed the headline method in part out of frustration with reviewing scores of manuscripts submitted for publication that had rich information but resulted in insipid so-called findings. These findings have often been called themes. A red flag was always when the results were reported as “four themes emerged.” Churchill once said, “Out of intense complexities intense simplicities emerge.” Although he was referring to the fact that difficult things can have simple solutions, it reminds me of weak conclusions apparently emerging from interesting words.
Electronic qualitative-analysis programs have probably been partial culprits in this content-analysis approach to analysis. Such programs are excellent for tagging data (like coding), for identifying the locations of data bits, and for synthesizing the most common words or phrases, whether at the raw-data level or at the level of codes or categories. For example, from a data set of early interventionists’ perceptions of a change in model and philosophy, it’s unsatisfactory to read that the themes of professionalism, identity, views of families, and training emerged. What about these so-called themes? To make matters worse, the results sometimes provide subthemes that are simply more topics. The headline method provides directional findings or hypotheses as alternatives to categorical findings or themes. According to grounded theory, hypotheses can be derived from the data and the analyses (Strauss, 1987).
The headline method is so named because the critical step in the analysis is proposing headlines or hypotheses. According to grounded theory, we are usually not testing hypotheses in qualitative research. In the headline method, we dissect the data to tag them and to help us become familiar with them. That familiarity leads us to arrive at potential conclusions about phenomena under study. We word those conclusions as hypotheses. We then go back to the data to see whether they really support these hypotheses. We edit the hypotheses as necessary, continuously returning to the data, which is called recursivity (LeCompte & Preissle, 1994) or constant comparison (Glaser & Strauss, 2009).


The nine steps in using the headline method are as follows:
1. Organize the data
   a. Identify data sections
   b. Identify data bits
2. Code the data bits
3. Categorize the codes
4. Propose headlines
5. Build confirming and disconfirming tables
6. Edit the headlines
7. More tables as necessary
8. Secure agreement
9. Present findings
This guide begins from the point where the researcher begins to have data. In qualitative research, the researcher doesn’t have to wait until all the data are in.


Four types of data are typically analyzed: transcripts, tabled notes, field notes, and documents. Transcripts are verbatim written records of oral discourse, which could be individual or group interviews (i.e., focus group discussions).
Tabled notes are summaries of what people said, organized into conceptual spaces, such as tables. The organization of these tables is decided a priori, which is a potential limitation of this method. An advantage is that the data are organized from the beginning. Prominent qualitative analysts have described matrices and networks as useful display formats (Miles, Huberman, & Saldaña, 2013): “A matrix is essentially the ‘intersection’ of two lists, set up as rows and columns” (p. 109); therefore, it is what I have described as a table. Although Miles et al. describe matrices primarily as display options, they can also be used in entering data. In two studies, I recorded the interviews and, on playing each one back, wrote summary statements, including quotations, in the table, so the information was already sorted.
Field notes are “narrative descriptions of people, places, human and natural events, patterns of interaction, statements of value and belief, and the historical context in which the preceding take place” (LeCompte & Preissle, 1994, p. 8). Like transcripts, they are rarely conceptually organized, so they require some treatment in analysis.
Finally, documents are data types that, in my field, are likely to be related to an individual child (e.g., an individual education program), a family (e.g., an ecomap), an agency (e.g., a brochure), or a law or rule (e.g., a policy).

Identifying Data Sections

Before coding data bits, data sections need to be identified. Data sections are usually entries pertaining to one event, such as an interview, an observation, or a document. Each entry (i.e., data section) usually has a date and an identifier for the person, place, or document. The purposes of identifying data sections are retrieval and analysis of the diversity of data sources: If many of the same opinions come from one person, the researcher needs to know that, and the data section identifier makes it visible.

Data Bits

Data bits are single ideas that can receive one or more codes. They can be single words, phrases, sentences, or even paragraphs, depending on how molecular the coding is. It is important to identify data bits in case researchers want to check intercoder agreement at the coding level. To ensure that the same data are being coded independently, each coder has to know what the data bits are. It is possible to identify data bits while doing first-pass coding.
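
One way to keep data sections and data bits retrievable is to give each a small record. The following is only a minimal sketch in Python, not part of the headline method itself. The class names, field names, identifiers such as INT-01, and the example bits are all hypothetical and made up for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataBit:
    text: str                                        # a single idea
    codes: List[str] = field(default_factory=list)   # one or more codes


@dataclass
class DataSection:
    section_id: str                                  # hypothetical identifier
    date: str
    source: str                                      # person, place, or document
    bits: List[DataBit] = field(default_factory=list)


def bits_per_source(sections, code):
    """Count, per source, the data bits that carry the given code."""
    counts = Counter()
    for section in sections:
        for bit in section.bits:
            if code in bit.codes:
                counts[section.source] += 1
    return counts


# Made-up example data, for illustration only.
sections = [
    DataSection("INT-01", "2014-10-02", "Teacher A", [
        DataBit("Children stay busy during centers.", ["ENGAGEMENT"]),
        DataBit("Circle time runs too long.", ["ENGAGEMENT", "SCHEDULE"]),
    ]),
    DataSection("INT-02", "2014-10-09", "Teacher B", [
        DataBit("I plan activities the night before.", ["SET-UP"]),
        DataBit("They wander when I am not ready.", ["ENGAGEMENT"]),
    ]),
]

engagement_counts = bits_per_source(sections, "ENGAGEMENT")
print(engagement_counts)
```

A tally like this is one way to see whether many bits with the same code come from a single source.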

First-Pass Coding

In grounded research, the investigator does not analyze data with specific codes in mind… theoretically. In actuality, researchers begin analysis with theories that have driven the research and with previous experiences, including knowledge of the literature, that make certain codes likely to be used. For example, when I approach field notes of observations made in young children’s classrooms, I will always notice and therefore code instances of child engagement. I believe engagement is the key to learning and I have been studying it for 30 years. It isn’t all I notice. But it is naïve to think that qualitative analysts are using a tabula rasa in developing first-pass codes.
The researcher assigns one- or two-word codes to data bits. As more instances of the same concept occur, the researcher might use the same codes, thereby reducing the number of codes that will need to be categorized in the next step. But these emerging codes should not constrain researchers; they should not apply codes that don’t fit well just because the codes have been used previously. Researchers with a good vocabulary have an advantage because they can use different words for similar but slightly different concepts.
Recursivity is important in data analysis so ideas researchers form later in the process can be used with data they coded earlier. For example, a researcher might see “the teacher glanced at Tony and seemed about to say something but then turned back to Norah.” This data bit was first coded as missed opportunity. But, later in the process, the researcher had come across data bits that were coded selective reinforcement. On returning to this data bit, the researcher changed the former code to the latter. Once all the data have been coded, with the researcher going back through the data to do this recursive coding, it is time to categorize the codes.
Researchers who are not using a qualitative coding program can use Word. Some researchers put codes into comments; others use bookmarks. I prefer to put the codes into the text, at the end of data bits, in all caps. Sometimes, I highlight them. A disadvantage of analyzing in Word is that one cannot search across documents. So, at the point where I want to retrieve information with the search function, I put all the documents together into one large document.
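
For readers who export their coded text to plain text, retrieval across documents can also be scripted. The sketch below rests on assumptions that go beyond what is described above: each data bit sits on its own line, and the all-caps codes at the end of the bit are wrapped in square brackets. The section identifiers and the second and third example bits are made up; the first bit is the classroom example quoted earlier.

```python
import re

# Assumption: codes are written in all caps inside square brackets.
CODE_PATTERN = re.compile(r"\[([A-Z][A-Z \-]*)\]")


def parse_bits(documents):
    """Turn {section identifier: text} into a list of coded data bits."""
    bits = []
    for section_id, text in documents.items():
        for line in text.splitlines():
            line = line.strip()
            if not line:
                continue
            codes = CODE_PATTERN.findall(line)
            narrative = CODE_PATTERN.sub("", line).strip()
            bits.append({"section": section_id, "text": narrative, "codes": codes})
    return bits


def find_by_code(bits, code):
    """Return every data bit, from any document, that carries the code."""
    return [bit for bit in bits if code in bit["codes"]]


documents = {
    "OBS-01": (
        "The teacher glanced at Tony and seemed about to say something "
        "but then turned back to Norah. [SELECTIVE REINFORCEMENT]\n"
        "Children chose their own centers. [ENGAGEMENT] [CHOICE]"
    ),
    "OBS-02": "Three children wandered during circle. [ENGAGEMENT]",
}

bits = parse_bits(documents)
hits = find_by_code(bits, "ENGAGEMENT")
for hit in hits:
    print(hit["section"], "|", hit["text"])
```

Because every bit keeps its section identifier, the search covers all documents at once without merging them into one file.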

Second-Pass Categorizing

This step is the closest to the theme approach to qualitative data analysis there is. Qualitative software packages can help with categorizing codes, although I still prefer to do it in Word. One can write all the codes down and look for patterns, such as codes related to the same idea or to contrasting ideas. Codes can be put into networks or concept maps; I use Cmap Tools. What we want to end up with is a list of categories that express concepts—like metacodes. Inside each category are various codes. Some codes can belong to more than one category. For example, the code integrated therapy can belong to the category inclusion and the category service delivery.
Some codes might not end up in any category. They are usually codes that don’t appear very often, even in conjunction with similar-concept codes.
Researchers should then try linking categories that are related to each other. This will help establish linkages and possibly lead to headlines.
It is sometimes helpful to return to the data and assign every data bit to one or more categories. This can be very useful when building confirming and disconfirming tables. To do this, the researcher goes through the data, entering the category name at the end of each data bit. This can be done on the version with the codes already in the data set or in a copy of the uncoded data set, where the data bits are identified. I prefer to add categories to codes, so I can see everything, even though the data become busy with the original narrative, codes, and categories.
But, if the researchers consider that they have stayed close to the data, they can proceed to this organization of the categories without a recursive coding step. They will return to the data bits in Step 5.
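
The bookkeeping for this step, where a code can sit in more than one category and some codes sit in none, can be sketched as a simple mapping. This is an illustration under assumptions, not a prescribed tool: only the code integrated therapy and the categories inclusion and service delivery come from the example above; the other codes are hypothetical.

```python
# Each category holds a set of codes; a code may appear in several categories.
categories = {
    "inclusion": {"integrated therapy", "peer interaction"},
    "service delivery": {"integrated therapy", "pull-out"},
}


def categories_for(code, categories):
    """List every category that contains the code."""
    return sorted(name for name, codes in categories.items() if code in codes)


def uncategorized(all_codes, categories):
    """List codes that did not end up in any category."""
    assigned = set().union(*categories.values())
    return sorted(set(all_codes) - assigned)


all_codes = ["integrated therapy", "peer interaction", "pull-out", "weather"]

print(categories_for("integrated therapy", categories))
print(uncategorized(all_codes, categories))
```

Listing the leftover codes makes it easy to check whether they are truly rare or whether a category is missing.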


The categories are organized, and linkages are drawn. The researcher is now ready to propose headlines, so called to emphasize that they should provide a hypothesis, a potential finding, a story. An example is “Teachers rarely set up activities in advance.” This headline screams at the reader, we hope. It is more interesting and, importantly, more verifiable than “A theme related to set up emerged from the data” or “Frequency of set-up.”
Headlines should be in the active voice and should not have too many qualifiers that render the headline nondirectional or wishy-washy. There’s plenty of time for that to happen. For example, “Families are confused by the IFSP development process” is a good, clear headline, compared to “Some families are sometimes confused by the IFSP process.” We can more easily look for confirming and disconfirming evidence of the former headline than the latter. As a result of checking back through the data, we might have to end up with the wishy-washy version, but not at the beginning.

Confirming and Disconfirming Tables

Building these tables is the most important recursive step in the headline method and is the major verification process. For each headline, the researcher builds a table with confirming data bits and disconfirming data bits. To look through the data, the researcher uses the find function in Word, looking for relevant categories or even codes. The researcher isn’t limited to these data bits, but searching can make the process more efficient than reading through all the data again. The relevant data bits are copied and pasted into the tables, along with the identifiers for the data section, so the researcher can look down through all the confirming data. The data section identifiers are needed to see if the data in the tables belong to different informants or the same ones.
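
The mechanics of a confirming and disconfirming table can be sketched in a few lines. The sketch assumes the data bits are available as records with a section identifier, codes, and categories; the example bits and identifiers are made up. Deciding whether a bit confirms or disconfirms the headline remains the researcher's judgment, so the verdicts are simply typed in here.

```python
def candidate_bits(bits, search_terms):
    """Find bits whose codes or categories match any search term."""
    terms = set(search_terms)
    return [b for b in bits if terms & (set(b["codes"]) | set(b["categories"]))]


def build_table(judged_bits):
    """Sort (bit, verdict) pairs into the two columns of the table."""
    table = {"confirming": [], "disconfirming": []}
    for bit, verdict in judged_bits:
        table[verdict].append((bit["section"], bit["text"]))
    return table


def informants(table, column):
    """List the distinct data sections that contributed to a column."""
    return sorted({section for section, _ in table[column]})


headline = "Teachers rarely set up activities in advance."

# Made-up data bits, for illustration only.
bits = [
    {"section": "OBS-01", "text": "Materials were gathered after the children sat down.",
     "codes": ["LATE PREP"], "categories": ["SET-UP"]},
    {"section": "OBS-02", "text": "Paint and paper were on the tables before arrival.",
     "codes": ["ADVANCE PREP"], "categories": ["SET-UP"]},
    {"section": "OBS-01", "text": "The teacher searched for scissors mid-activity.",
     "codes": ["LATE PREP"], "categories": ["SET-UP"]},
    {"section": "OBS-03", "text": "Children chose their own centers.",
     "codes": ["CHOICE"], "categories": ["ENGAGEMENT"]},
]

candidates = candidate_bits(bits, ["SET-UP"])
# The researcher's judgments, one per candidate bit, in order.
verdicts = ["confirming", "disconfirming", "confirming"]
table = build_table(zip(candidates, verdicts))

print(headline)
print("Confirming sources:", informants(table, "confirming"))
print("Disconfirming sources:", informants(table, "disconfirming"))
```

In this made-up case both confirming bits come from the same observation, which is exactly the situation the data section identifiers are meant to expose.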

Editing the Headlines

Now that the researchers can see the evidence supporting or not supporting the headline, they change the headline, if necessary. The change might be to eliminate it altogether. Another change might be to alter the wording to reflect the nuances that became apparent when the confirming or disconfirming evidence was listed. As mentioned earlier, if the headline is mostly correct but there are enough instances of disconfirming evidence, the strongly worded headline might be tempered with suitable adjectives or adverbs. In a recent study (not yet published) on integrated therapy, we had a headline that began as “In therapy sessions, the child was alone with the therapist, with no other children around.” After building a confirming and disconfirming table, we changed it to “In most therapy sessions, other children were not involved.” In this study, as a result of the recursivity inherent in building these tables, we discovered two more headlines than we originally had.

More Tables

If a headline is altered significantly, a new confirming and disconfirming table has to be built. For example, in another study, we began with this headline: “Children with autism remain in an unsocial state despite social initiations by others.” The confirming table had only three data bits. We reconsidered the hypothesis and changed it to “Children with autism inconsistently respond to social bids by others.” This required another look at the data to build a new confirming and disconfirming table. This time, we found enough data bits to support the statement and very few data bits to refute it, so the new version remained a finding.


Because the “instrument” in qualitative research is the researcher, as opposed to a tool, the concept of interobserver agreement is less relevant than in quantitative research. In quantitative research, one of the indices of the reliability of the scores is the extent to which two people independently using the tool would produce similar scores. It is the scores and, by extension, the tools that are being judged by the reliability estimate. In qualitative research, one would not expect two people necessarily to agree, because each person has his or her own history, background, knowledge, opinions, and so on.
On the other hand, questioning whether the reading or listening of the narrative would generalize to another researcher is reasonable. The idiosyncratic-researcher argument for not attending to interobserver agreement breaks down if the researcher either has some bizarre views on the phenomenon under study or has failed to describe his or her own background, so the reader knows about the lens through which the data are seen.
As soon as we code narrative data, the opportunity for interreader agreement presents itself. In this approach, we have three levels at which agreement can be determined.


Although the list of codes is iteratively constructed by the primary researcher, it would be an unreasonable standard to expect a second person’s iteratively constructed list to be the same. So, in this approach, the first researcher presents the list of codes to the second reader. This list can have definitions, including some examples, but not too many. Too many would obviate the test of agreement. The first researcher should also mark the data bits on the narrative.
The second reader determines which code to apply to each data bit. If the first researcher applies two codes to a data bit, and the second reader applies only one, but it is one that agrees with the first researcher’s codes, that counts as an agreement, even though the second reader did not apply the second code. Some researchers prefer only one code per data bit, to help with interreader agreement, but I prefer to err on the side of nonmutually exclusive codes. The goal is 85% agreement on the coding of data bits.
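
The agreement rule can be computed as follows. This is a minimal sketch that generalizes the rule above to "any shared code counts as an agreement," which is an assumption; the codes in the example are made up.

```python
def code_agreement(first, second):
    """Proportion of data bits on which the two readers share at least one code.

    first, second: lists of sets of codes, one set per data bit, in the same order.
    """
    if len(first) != len(second):
        raise ValueError("Both readers must code the same data bits.")
    if not first:
        raise ValueError("At least one data bit is required.")
    agreements = sum(1 for a, b in zip(first, second) if a & b)
    return agreements / len(first)


GOAL = 0.85  # the 85% goal stated above

# Made-up codes for four data bits.
first_reader = [{"ENGAGEMENT", "CHOICE"}, {"SET-UP"}, {"ENGAGEMENT"}, {"ROUTINES"}]
second_reader = [{"CHOICE"}, {"SET-UP"}, {"ROUTINES"}, {"ROUTINES"}]

agreement = code_agreement(first_reader, second_reader)
print(f"Agreement: {agreement:.0%}; goal met: {agreement >= GOAL}")
```

Here the readers agree on three of four data bits, so agreement is 75% and falls short of the goal.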


It is also possible to establish agreement on categories, instead of on codes. The rule in inter-“rater” agreement is that it should be established at the level at which the data are reported. For example, if you code behavior in a single-subject study and report the frequency of those codes, the interobserver agreement needs to be at the code level. If you collapse some codes into bigger categories, in a manner similar to what I have described here for qualitative analysis, and report the findings at the level of categories, not codes, interobserver agreement is at the level of categories, not codes. Therefore, in qualitative analysis, the first coder codes the narrative data and then collapses the codes into fewer categories. The second reader uses only the list of categories to categorize each data bit. Again, the goal is 85% agreement.


Agreement at the level of headlines is not quite the same, because data bits are not examined. Instead, the second reader examines the confirming and disconfirming tables for the findings (i.e., headlines) and informs the first analyst whether any examples do not fit in the columns in which they were listed. A more rigorous type of inter-“rater” analysis is for the second reader to be given the hypothesis (i.e., finding or headline) with instructions to go through all the data and complete confirming and disconfirming tables. The expected agreement should be an approximately equal ratio of confirming to disconfirming examples. For instance, if Reader 1 found a 10:1 ratio of confirming to disconfirming examples, and Reader 2 found a 12:1 ratio, the agreement would be considered 83% (10/12). Agreement above 80% is considered good with this calculation. Analysts have, therefore, the choice of reviewing the first reader’s confirming and disconfirming tables or completing new ones, independently.
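
The ratio calculation in the example above can be written out as a short function. The text gives only the single worked case (10/12); dividing the smaller ratio by the larger is my reading of it, so treat this as a sketch under that assumption. The function name and the requirement that all counts be positive are also assumptions.

```python
def ratio_agreement(confirming_1, disconfirming_1, confirming_2, disconfirming_2):
    """Smaller confirming-to-disconfirming ratio divided by the larger one."""
    counts = (confirming_1, disconfirming_1, confirming_2, disconfirming_2)
    if min(counts) <= 0:
        raise ValueError("This sketch requires positive counts in every cell.")
    ratio_1 = confirming_1 / disconfirming_1
    ratio_2 = confirming_2 / disconfirming_2
    return min(ratio_1, ratio_2) / max(ratio_1, ratio_2)


# Reader 1 found 10:1 and Reader 2 found 12:1, as in the example above.
headline_agreement = ratio_agreement(10, 1, 12, 1)
print(f"Agreement: {headline_agreement:.0%}")
```

With these inputs the result is 10/12, or about 83%, which is above the 80% mark.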

Member Check

The member check is another test of veracity. Information is returned to participants to secure their agreement. This member check is best done at the level of findings (i.e., headlines, hypotheses). If a case-study method is being employed, the researcher can send back either the findings for the individual case or the findings for the whole group of cases (e.g., the final conclusions). “Members” are asked to comment on whether the findings seem reasonable to them. Researchers can ignore the feedback, make some adjustments to the findings, or overhaul them.

Presentation of Findings

One of the most common and irritating ways of presenting qualitative results is through a “garden path analysis” (Bazeley, 2009), in which “a thematic ‘analysis’ can take the reader along a pleasant pathway that leads nowhere: ‘Here are the roses, there are the jonquils, and aren’t the daffodils lovely today!’” (p. 9). The method described here allows the researcher to state actual findings, which become the first structure for presentation, such as in a research report or manuscript.


The headline is presented, explained, and supported with some examples—not too many. You’d be surprised how many readers skip over the examples.

Examples of Analyses

In the Method, not Results, section, an example of a confirming and disconfirming table should be given. If the number of examples is too large, some representative examples are listed.

Linkages and Patterns

After the headlines have been described, the researcher looks for linkages and patterns among the headlines. Some potentially causal relationships might be found. For example, if one headline is “Teachers recently graduated understand the concept of engagement better than do experienced teachers” and another is “Teachers who talk much about engagement have classrooms where children are active learners,” one might hypothesize that younger teachers have more active learners, not because of their youth (alone) but because they focus on engagement. Another linkage might be conceptual. For example, one headline might be “Home visits where numerous families are present are more fun” and another might be “Talkative mothers make home visits easy.” The link between the two is features affecting the home visit atmosphere. Many types of linkages and patterns are possible, and it is far preferable to examine these linkages than just to list findings, and we should avoid listing “themes.”
This document has described an efficient, grounded approach to analyzing qualitative data. It avoids the garden path problem and it leads to further research, because every finding is a hypothesis. The method involves coding, categorizing, establishing headlines, confirming or editing them, determining agreement among researchers, and presenting findings.