By default, the base size in Dapresy calculations is the number of distinct respondent IDs. To handle transactional data (such as sales data or other stacked data), the user must be able to select another numeric or string variable to use as the base size in the calculation.


Example: the data file looks like the one below. The “record” column shows the ID of each client; the data consists of 7 unique clients. A client appears in multiple rows when that client purchased more than one product (one row = one product).


To calculate the required result (the average per client), the user must be able to select which variable to use as base. In this example, the “Distinct Records” variable has to be used as base to get the “Average Sum per client” (see the Calculation chapter for details).
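The difference between a row-based and a client-based base size can be sketched in Python. The rows and column names below are hypothetical and are not the data file from the example. The sketch only illustrates that the base becomes the count of distinct values of the selected variable.

```python
# Hypothetical stacked data (assumed column names): one row per purchased product.
rows = [
    {"record": "A", "product": "coffee", "total_sum": 10.0},
    {"record": "A", "product": "tea", "total_sum": 5.0},
    {"record": "B", "product": "coffee", "total_sum": 20.0},
    {"record": "C", "product": "tea", "total_sum": 7.0},
]


def distinct_base(rows, base_variable):
    """Return the number of distinct values of the chosen base variable."""
    return len({row[base_variable] for row in rows})


row_count = len(rows)                        # one entry per product row
client_base = distinct_base(rows, "record")  # one entry per distinct client
```

With these rows, client "A" occupies two rows but contributes only once to the base.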

 

 

The base size option is implemented in Storyteller charts and tables, not in the legacy tools or in the Cross table tool 2.0. 


The new base size option is supported when any of the following calculations are selected:

  • Percentage share
  • Mean
  • Count


Selecting the base size is straightforward: a dropdown list appears in the Calculation panel, as shown in the image below. The list contains all numeric and string variables, and “RespondentID” shall be the default option.

 

The base size in the calculation shall be the count of distinct values of the selected base size variable. The examples below cover two of the supported calculations, Mean and Percentage share.

 

Example 1: Mean - Open numeric

The data looks like the table below. The user has chosen to show the average of “Total Sum”, with “Record” selected as base variable.

 

Option 1 - Unweighted data

Numerator: The unweighted sum of all cells in “Total Sum” = 189,82

Denominator: The number of distinct “Record” values (COUNT) = 7

Average: 189,82 / 7 = 27,12
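The unweighted mean described above can be sketched as follows. The function name, column names and rows are assumptions made for illustration. Only the figures 189,82 and 7 are taken from the example.

```python
def mean_with_base(rows, value_variable, base_variable):
    """Unweighted mean: sum of all values divided by the distinct base count."""
    total = sum(row[value_variable] for row in rows)
    base = len({row[base_variable] for row in rows})
    return total / base


# Hypothetical rows (assumed column names), one row per purchased product.
rows = [
    {"record": "A", "total_sum": 10.0},
    {"record": "A", "total_sum": 5.0},
    {"record": "B", "total_sum": 20.0},
    {"record": "C", "total_sum": 7.0},
]
average_per_client = mean_with_base(rows, "total_sum", "record")  # 42.0 / 3

# The figures quoted in the example, rounded to two decimals.
quoted_average = round(189.82 / 7, 2)
```

Dividing by the number of rows instead would give the average per product, not per client.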

 

Option 2 - Weighted data

Numerator: The weighted sum of all cells in “Total Sum” = 168,3

Denominator: The weighted number of distinct “Record” values* = 6,45

Average: 168,3 / 6,45 = 26,096

*Each distinct “Record” can span multiple respondent rows. The weight is taken from the first row of the record, on the assumption that the weight is the same for all rows of a client.
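A minimal sketch of the weighted variant follows. It assumes that the "weighted sum" means each row's value multiplied by that row's weight, and that the weighted base is the sum of the first-row weights of each distinct record, as the footnote describes. The function name, column names and data are hypothetical.

```python
def weighted_mean_with_base(rows, value_variable, base_variable, weight_variable):
    """Weighted mean using the first-row weight of each distinct base value."""
    # Assumption: every row contributes value * weight to the numerator.
    weighted_sum = sum(row[value_variable] * row[weight_variable] for row in rows)

    # Keep only the weight of the first row seen for each distinct record.
    first_weight = {}
    for row in rows:
        first_weight.setdefault(row[base_variable], row[weight_variable])

    weighted_base = sum(first_weight.values())
    return weighted_sum / weighted_base


# Hypothetical rows; the weight is constant within each record.
rows = [
    {"record": "A", "total_sum": 10.0, "weight": 0.5},
    {"record": "A", "total_sum": 5.0, "weight": 0.5},
    {"record": "B", "total_sum": 20.0, "weight": 1.5},
    {"record": "C", "total_sum": 7.0, "weight": 1.0},
]
weighted_average = weighted_mean_with_base(rows, "total_sum", "record", "weight")
```

Here the weighted sum is 44.5 and the weighted base is 3.0, so record "A" contributes its weight of 0.5 to the base only once.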

 

Example 2: % - Categorical

The data looks like the table below. The user has chosen to show the percentage for the Manufacturer option “Starbucks”, with “Record” selected as base variable.

  

Option 1 - Unweighted data

Numerator: The unweighted count of “Starbucks” in the Manufacturer variable = 5. “Starbucks” must be counted at most once per distinct “Record”; otherwise the result will be wrong.

Denominator: The number of distinct “Record” values (COUNT) = 7

Percentage share Starbucks: 5 / 7 * 100 = 71,43%
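The "at most once per distinct Record" rule can be sketched by collecting the set of records that contain the category. The function name, column names and rows are hypothetical. Only the figures 5 and 7 come from the example.

```python
def share_with_base(rows, variable, category, base_variable):
    """Percentage of distinct base values having at least one row in the category."""
    base_values = {row[base_variable] for row in rows}
    # Using a set means a record with several matching rows is counted once.
    hits = {row[base_variable] for row in rows if row[variable] == category}
    return 100 * len(hits) / len(base_values)


# Hypothetical rows: record "A" bought two Starbucks products.
rows = [
    {"record": "A", "manufacturer": "Starbucks"},
    {"record": "A", "manufacturer": "Starbucks"},
    {"record": "B", "manufacturer": "Other"},
    {"record": "C", "manufacturer": "Starbucks"},
]
starbucks_share = share_with_base(rows, "manufacturer", "Starbucks", "record")

# The figures quoted in the example, rounded to two decimals.
quoted_share = round(5 / 7 * 100, 2)
```

Counting matching rows instead of matching records would give 3 hits out of 3 clients here, which overstates the share.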

 

Option 2 - Weighted data

Numerator: The weighted count of “Starbucks” in the Manufacturer variable* = 5,2. “Starbucks” must be counted at most once per distinct “Record”; otherwise the result will be wrong.

Denominator: The weighted number of distinct “Record” values* = 6,45

Percentage share Starbucks: 5,2 / 6,45 * 100 = 80,62%

 

*Each distinct “Record” can span multiple respondent rows. The weight is taken from the first row of the record, on the assumption that the weight is the same for all rows of a client.
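The weighted percentage can be sketched in the same way. It assumes that each distinct record contributes its first-row weight once to the base and, if any of its rows match the category, once to the numerator. The function name, column names and data are hypothetical.

```python
def weighted_share_with_base(rows, variable, category, base_variable, weight_variable):
    """Weighted percentage of distinct base values that contain the category."""
    first_weight = {}  # weight of the first row seen per distinct record
    hits = set()       # records with at least one row in the category
    for row in rows:
        first_weight.setdefault(row[base_variable], row[weight_variable])
        if row[variable] == category:
            hits.add(row[base_variable])

    weighted_hits = sum(first_weight[key] for key in hits)
    weighted_base = sum(first_weight.values())
    return 100 * weighted_hits / weighted_base


# Hypothetical rows; the weight is constant within each record.
rows = [
    {"record": "A", "manufacturer": "Starbucks", "weight": 0.5},
    {"record": "A", "manufacturer": "Starbucks", "weight": 0.5},
    {"record": "B", "manufacturer": "Other", "weight": 1.5},
    {"record": "C", "manufacturer": "Starbucks", "weight": 1.0},
]
weighted_starbucks_share = weighted_share_with_base(
    rows, "manufacturer", "Starbucks", "record", "weight"
)
```

With these rows the weighted numerator is 0.5 + 1.0 = 1.5 and the weighted base is 3.0.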