How to randomize question order judiciously

When you're studying a set of questions and want to make sure the results aren't biased by the order in which the questions are presented, we recommend randomizing the order of the questions while keeping questions of a similar type together (rather than mixing all the question sections together and presenting every question in a completely random order).

For example, if the goal of the study is to see how different worldviews correlate with other factors, and if you were concerned that the order in which worldviews were presented could affect the results, then you could randomize the order in which those worldview questions were presented.

There are a couple of different ways you could randomize question order:

  • You could use the *randomize keyword (as in the first two examples below), or...
  • You could create a collection of questions to be asked in a loop and use the .shuffle function to randomize the order of that collection of questions. (See the .shuffle example below.)

Example of how to use the *randomize and *group keywords

>> agreementScale = [["Totally agree", 3], ["Agree", 2], ["Somewhat agree", 1], ["Neither agree nor disagree", 0], ["Somewhat disagree", -1], ["Disagree", -2], ["Totally disagree", -3]]

--personality
*randomize: all
	*group
		*question: I am an organized person.
			*answers: agreementScale

	*group
		*question: I am always on time.
			*answers: agreementScale

	*group
		*question: I keep my home clean.
			*answers: agreementScale

--demographics
*randomize: all
	*group
		*question: How old are you?
			*type: number

	*group
		*question: How many children do you have?
			*type: number

	*group
		*question: How many years have you lived in your current home?
			*type: number

In this example, the *group keyword isn't required for the randomization to occur, but it's useful for two reasons: (1) if you later decide that the randomization should apply to more than one line of code (e.g., to more than one *question at a time), the *group keyword keeps those lines together, so a participant who is randomized to the first line of a group gets every line in that group; and (2) the *group keyword lets you name the groups (e.g., *group: groupName), which can be useful when analyzing the data later. Please see this section of the manual for more information.

Example of how to use the *randomize and *group keywords, and how to name groups

>> questionA = "How much do you care about reducing poverty?"
>> questionB = "How much money (in whole USD) are you planning to donate to effective poverty-reducing charities this year?"

*randomize
	*name: questionOrder

	*group: questionAFirst
		*question: {questionA}
		*question: {questionB}

	*group: questionBFirst
		*question: {questionB}
		*question: {questionA}

In this example, when viewing the data CSV, you'll see a column called "Randomize (questionOrder)". In this column, each run will have an entry of "questionAFirst" or "questionBFirst", depending on the group to which the user of that run was randomly assigned.
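As a rough illustration of how that column might be used in analysis outside GuidedTrack, here is a minimal Python sketch that averages a numeric answer separately for each randomized group. The "donation" column name is hypothetical (it assumes the donation question's answer was saved under that name), and the numbers are made-up sample data, not real results.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Made-up sample rows standing in for a GuidedTrack CSV export.
# "donation" is a hypothetical column name; a real export will have
# more columns and may label them differently.
SAMPLE_CSV = '''Randomize (questionOrder),donation
questionAFirst,100
questionBFirst,40
questionAFirst,50
questionBFirst,60
'''

def mean_by_group(csv_file, group_column, value_column):
    """Return the mean of value_column for each distinct entry in group_column."""
    values = defaultdict(list)
    for row in csv.DictReader(csv_file):
        values[row[group_column]].append(float(row[value_column]))
    return {group: mean(group_values) for group, group_values in values.items()}

means = mean_by_group(io.StringIO(SAMPLE_CSV), "Randomize (questionOrder)", "donation")
print(means)
```

To run this against a real export, replace io.StringIO(SAMPLE_CSV) with an opened file, e.g. open("export.csv", newline="").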

Example of how to use the .shuffle function to display questions in randomized order

------------
-- CAREER --
------------

-- ask random career questions
>> careerQuestions = ["What was the earliest career plan you can remember forming?", "Where do you see yourself 5 years from now?", "Where do you see yourself 10 years from now?", "What information would be most likely to convince you to change your current career plans, and what would you need to do to obtain that information?", "Which problems would you want to work on in your career, if you could work on anything?"]
>> careerQuestions.shuffle
>> careerAnswers = {}

*for: questionToAsk in careerQuestions

	*question: {questionToAsk}
		*save: answer

	>> careerAnswers[questionToAsk] = answer

-------------
-- FRIENDS --
-------------

-- ask random friend questions
>> friendQuestions = ["Do you listen attentively when your friends share their plans, worries, or problems with you?", "How do you show your friends respect?", "How do you comfort friends when they're distressed?"]
>> friendQuestions.shuffle
>> friendAnswers = {}

*for: questionToAsk in friendQuestions

	*question: {questionToAsk}
		*save: answer

	>> friendAnswers[questionToAsk] = answer

Note: This approach has one significant drawback. In the CSV generated from the above example, there will be only two relevant columns of data: "careerAnswers" and "friendAnswers". Each participant's value in these columns will be the corresponding association (careerAnswers or friendAnswers) in JSON format (i.e., using {"key": "value"} notation instead of GT's {"key" -> "value"} notation). The keys of each association will be the questions, and the values will be the participant's answers.

Now, if you're a programmer or data scientist, you may have no trouble writing code outside the context of GuidedTrack that reads in the CSV file and extracts the key-value pairs from those JSON objects. But if you're planning to use something like Excel to analyze your data, then (barring some very impressive Excel skills) your data will be very hard to analyze while trapped inside these objects.

So, in summary, if you're planning on using Excel or some other high-level, automated tool to analyze your data, we strongly recommend that you do not use this pattern! On the other hand, if you're a programmer or data scientist who's comfortable writing code to extract data from JSON objects, then the above pattern of iterating over a shuffled collection of questions does have the benefit of compacting your code quite a lot and keeping it DRY (i.e., free of repetition), especially as the number of questions grows large.
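For readers in the second camp, the extraction step might look something like the following Python sketch. It is only an illustration under stated assumptions: the "Run ID" column and the sample rows are hypothetical stand-ins for a real export, and the function name is our own. It turns each key of the JSON associations into its own flat column.

```python
import csv
import io
import json

# Hypothetical stand-in for a GuidedTrack CSV export. The "Run ID" column
# and the answers are made up; a real export has more columns.
SAMPLE_CSV = '''Run ID,careerAnswers,friendAnswers
1,"{""Where do you see yourself 5 years from now?"": ""Teaching""}","{""How do you show your friends respect?"": ""By listening""}"
2,"{""Where do you see yourself 5 years from now?"": ""Abroad""}",
'''

def flatten_json_columns(csv_file, json_columns):
    """Return one flat dict per row, with each JSON key promoted to its own column."""
    flat_rows = []
    for row in csv.DictReader(csv_file):
        # Keep every ordinary column as it is.
        flat = {name: value for name, value in row.items() if name not in json_columns}
        for column in json_columns:
            cell = row.get(column) or ""
            # Treat an empty cell as "no answers recorded" for that section.
            answers = json.loads(cell) if cell.strip() else {}
            for question, answer in answers.items():
                flat[f"{column}: {question}"] = answer
        flat_rows.append(flat)
    return flat_rows

rows = flatten_json_columns(io.StringIO(SAMPLE_CSV), ["careerAnswers", "friendAnswers"])
for flat_row in rows:
    print(flat_row)
```

The flattened rows can then be written back out with csv.DictWriter (using the union of all keys as the field names) if a spreadsheet-friendly file is needed.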

Not so great example: mixing too many topics together in the same set of randomized questions

-- ask random questions
>> questions = ["How much do you like the color red?", "How much do you like the color blue?", "Do you eat a healthy breakfast in the morning?", "As a child, did you pay attention in school?", "As a child, did you have a lot of toys?"]
>> questions.shuffle
>> answers = {}

*for: questionToAsk in questions

	*question: {questionToAsk}
		*save: answer

	>> answers[questionToAsk] = answer

In the above example, the questions are randomized without being kept within groups of similar topics. Depending on how different the topics are from each other, this could slow participants down (because they repeatedly have to switch from thinking about one topic to thinking about a different one as they progress from one question to the next).

Unless you have a specific reason to examine the effects of randomizing the presentation order of questions across different topics in this way, we recommend presenting questions in sets that are grouped according to topic. An exception would be when you wish to check that participants are giving consistent/honest answers. In that scenario, you may wish to purposely mix questions from different topics together (and to have more of a gap between questions/sections that ask about a similar concept) in order to identify those participants who give contradictory answers to similar questions in different parts of the survey.
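One way such a consistency check could be done after data collection is sketched below in Python. Everything in it is an assumption made for illustration: the item names, the pairing of "similar" questions, the -3 to 3 scoring (borrowed from the agreement scale used earlier), and the gap threshold are all hypothetical choices, not a GuidedTrack feature.

```python
def flag_inconsistent(responses, paired_items, max_gap=3):
    """Return the participants whose answers to any paired items differ by more than max_gap."""
    flagged = []
    for participant, answers in responses.items():
        for first, second in paired_items:
            # Only compare pairs that the participant answered in full.
            if first in answers and second in answers:
                if abs(answers[first] - answers[second]) > max_gap:
                    flagged.append(participant)
                    break  # One contradictory pair is enough to flag a participant.
    return flagged

# Made-up responses on a -3 (totally disagree) to 3 (totally agree) scale.
responses = {
    "run1": {"organized": 3, "tidy": 2},
    "run2": {"organized": 3, "tidy": -3},
    "run3": {"organized": -1},
}
flagged = flag_inconsistent(responses, [("organized", "tidy")])
print(flagged)
```

A flagged participant is only a candidate for closer inspection; a suitable threshold depends on how closely the paired questions really match.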

