Miller High Life can in focus against a bokeh backdrop of an Eagles game at Lincoln Financial Field in Philadelphia

Slicing and Dicing Data Model Work-to-be-Done

In my last post, I shared a bit of background on the Casey-Fink Graduate Nursing Experience Survey, which I’m using in an ongoing case study on how to approach analytics strategy. Today’s post gets further into the details of how to set up the work-to-be-done in terms of data modeling.

Start with Known Unknowns

In my last post, I wrote briefly about getting all the caveats and obstacles out of the way first, so that once they’re acknowledged they become more like bokeh: still in the picture, but out of focus. So stay focused!

When I look at the Casey-Fink survey, I think about the survey itself in two ways:

  1. Standards: There are only three different types of questions that a nurse is asked in the survey.
  2. Product: There are semantic mappings that can connect a survey item to other content that could be offered as targeted remediation on an assertion of need by a nurse.

I don’t know enough *yet* to get moving on no. 2, but plenty is known about the question types, so as I translate the analysis of work to be done into JIRA stories, the first slice is going to focus on Casey-Fink Survey Item Data Modeling for cmi.interactions.

When YOU look at the Casey-Fink survey, you should see that of all the questions in the survey and the demographics, there are really only three different types of questions being asked:

  • Likert questions — the questions where one responds with a choice in a “range” of choices
  • Choice questions — like straight up multiple-choice questions, not offered in a continuous range like likerts
  • Fill-In questions — like the one open-response question shown at the end of the survey.

These three types of questions have established norms for how they should be modeled as xAPI statements, which can be found canonically within Appendix C of the xAPI spec.

This means the first slice at data modeling is pretty straightforward. For example, the first set of likert questions in the Casey-Fink survey will likely have a data model that looks, in part, like this:

"definition": {
	"description": {
		"en-US": "I feel confident communicating with physicians and other providers."
	},
	"type": "http://adlnet.gov/expapi/activities/cmi.interaction",
	"interactionType": "likert",
	"scale": [
		{
			"id": "likert_0", 
			"description": {
				"en-US": "Strongly Disagree"
			}
		},
		{
			"id": "likert_1", 
			"description": {
				"en-US": "Disagree"
			}
		},
		{
			"id": "likert_2", 
			"description": {
				"en-US": "Agree"
			}
		},
		{
			"id": "likert_3", 
			"description": {
				"en-US": "Strongly Agree"
			}
		}
	]
}
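Since every likert item in a set shares the same scale, a definition like the one above can be generated rather than typed out by hand. Here is a minimal sketch in Python; the helper name `likert_definition` and the `likert_N` id scheme are my own assumptions for illustration, not something the xAPI spec or the survey prescribes.

```python
import json

# Activity type IRI used for cmi.interaction activities (same value as in the snippet above).
ACTIVITY_TYPE = "http://adlnet.gov/expapi/activities/cmi.interaction"

# The four-point agreement scale shown in the example definition.
AGREEMENT_SCALE = ["Strongly Disagree", "Disagree", "Agree", "Strongly Agree"]


def likert_definition(prompt, labels, lang="en-US"):
    """Build an activity definition for one likert survey item (hypothetical helper)."""
    return {
        "description": {lang: prompt},
        "type": ACTIVITY_TYPE,
        "interactionType": "likert",
        # One scale entry per label, with ids likert_0, likert_1, ... in order.
        "scale": [
            {"id": f"likert_{i}", "description": {lang: label}}
            for i, label in enumerate(labels)
        ],
    }


definition = likert_definition(
    "I feel confident communicating with physicians and other providers.",
    AGREEMENT_SCALE,
)
print(json.dumps(definition, indent=2))
```

Every other item in the same set would reuse `AGREEMENT_SCALE` and change only the prompt text, which keeps the scale ids consistent across statements.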

Now, I’ve already done some of the likert data modeling above. If you’re playing along, look to the templates in the xAPI spec and transpose them for your version of the survey, so the team has a document of what these statements should look like, at least enough to conform to the standard conventions for expressing these interactions.
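For the other two question types, the transposed templates might look something like the sketch below. The helper names, the option ids, and the prompt text are all placeholders I made up for illustration; swap in the actual wording from your version of the survey. Because these are survey items, the sketch leaves out `correctResponsesPattern`, which is optional.

```python
import json

ACTIVITY_TYPE = "http://adlnet.gov/expapi/activities/cmi.interaction"


def choice_definition(prompt, options, lang="en-US"):
    """Build a definition for a multiple-choice item (hypothetical helper).

    `options` is a list of (id, label) pairs; the ids are whatever your team picks.
    """
    return {
        "description": {lang: prompt},
        "type": ACTIVITY_TYPE,
        "interactionType": "choice",
        "choices": [
            {"id": option_id, "description": {lang: label}}
            for option_id, label in options
        ],
    }


def fill_in_definition(prompt, lang="en-US"):
    """Build a definition for an open-response item (hypothetical helper).

    Fill-in interactions carry no list of options, only the prompt.
    """
    return {
        "description": {lang: prompt},
        "type": ACTIVITY_TYPE,
        "interactionType": "fill-in",
    }


# Placeholder content, NOT actual Casey-Fink wording.
choice_item = choice_definition(
    "Placeholder multiple-choice prompt",
    [("option_a", "Placeholder option A"), ("option_b", "Placeholder option B")],
)
fill_in_item = fill_in_definition("Placeholder open-response prompt")

print(json.dumps([choice_item, fill_in_item], indent=2))
```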

In a forthcoming post, we’ll get to the second and far more interesting part of this work — the data modeling for linking to other content and competency information!
