DesirabilityScores
This package is a port of the desiR R package. It contains functions for ranking, selecting, and integrating data. Main uses to date have been (1) prioritising genes, proteins, and metabolites from high-dimensional biology experiments, (2) multivariate hit calling in high-content drug discovery screens, and (3) combining data from diverse sources.
The vignette and publication provide more details.
Source code available on GitHub.
Exported Functions
DesirabilityScores.d_4pl
DesirabilityScores.d_central
DesirabilityScores.d_ends
DesirabilityScores.d_high
DesirabilityScores.d_low
DesirabilityScores.d_overall
DesirabilityScores.d_rank
DesirabilityScores.des_plot
DesirabilityScores.d_4pl — Method
d_4pl(x; hill, inflec, des_min = 0, des_max = 1)
Maps a numeric variable to a 0-1 scale with a four-parameter logistic function.
Arguments
x: Vector of values to map, whose elements are a subtype of Real. Values must also be non-negative: the other functions offered here can process negative inputs, but for negative inputs d_4pl(...) can map x to values outside the unit interval, depending on parameter settings.
des_min, des_max: The lower and upper asymptotes of the function. Default to zero and one, respectively.
hill: Hill coefficient. Controls the steepness and direction of the slope. A value greater than zero gives a positive slope and a value less than zero gives a negative slope. The higher the absolute value, the steeper the slope.
inflec: Inflection point. The point on the x-axis where the curvature of the function changes from concave upwards to concave downwards (or vice versa).
Details
This function uses a four-parameter logistic model to map a numeric variable onto a 0-1 scale. The hill parameter controls whether high or low values are deemed desirable: when hill > 0, high values are desirable, and when hill < 0, low values are desirable.
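The mapping described above can be sketched in plain Julia. This is a minimal illustration, not the package's source: the name sketch_d_4pl is hypothetical, and the formula is the usual four-parameter logistic form, which is assumed here because it reproduces the example output documented for d_4pl.

```julia
# Hypothetical helper for illustration only; not part of DesirabilityScores.
# Assumed formula: (des_min - des_max) / (1 + (x / inflec)^hill) + des_max
function sketch_d_4pl(x; hill, inflec, des_min = 0, des_max = 1)
    # With hill > 0 the curve starts at des_min for x = 0, passes the
    # midpoint of the two asymptotes at x = inflec, and approaches des_max.
    return @. (des_min - des_max) / (1 + (x / inflec)^hill) + des_max
end

sketch_d_4pl([1, 3, 4, 0, 2, 7, 10]; hill = 1, inflec = 5)
```

Under this assumed formula, a value equal to inflec always maps to the midpoint between des_min and des_max, which is a quick way to sanity-check a choice of parameters.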
Examples
julia> my_data = [1,3,4,0,2,7,10]
7-element Vector{Int64}:
1
3
4
0
2
7
10
julia> d_4pl(my_data; hill = 1, inflec = 5)
7-element Vector{Float64}:
0.16666666666666663
0.375
0.4444444444444444
0.0
0.2857142857142857
0.5833333333333333
0.6666666666666667
DesirabilityScores.d_central — Method
d_central(x, cut1, cut2, cut3, cut4; des_min = 0, des_max = 1, scale = 1)
Maps a numeric variable to a 0-1 scale such that values in the middle of the distribution are desirable. Values less than cut1 and greater than cut4 will have a low desirability. Values between cut2 and cut3 will have a high desirability. Values between cut1 and cut2 and between cut3 and cut4 will have intermediate values. This function is useful when extreme values are undesirable, for example outliers or values outside of allowable ranges. If cut2 and cut3 are close to each other, this function can be used when a target value is desirable.
Arguments
x: Vector of values to map, whose elements are a subtype of Real.
cut1, cut2, cut3, cut4: Values of the original data that define where the desirability function changes.
des_min, des_max: The lower and upper asymptotes of the function. Default to zero and one, respectively.
scale: Controls how steeply the function increases or decreases.
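To make the roles of the four cut points concrete, here is a minimal piecewise sketch. The name sketch_d_central is hypothetical, and the power-law ramps between the cut points are an assumption chosen because they reproduce the documented example output; the package's own implementation may differ in detail.

```julia
# Hypothetical helper for illustration only; not part of DesirabilityScores.
function sketch_d_central(x, cut1, cut2, cut3, cut4; des_min = 0, des_max = 1, scale = 1)
    y = map(x) do xi
        if xi <= cut1 || xi >= cut4
            0.0                                      # outside the outer cuts: undesirable
        elseif xi < cut2
            ((xi - cut1) / (cut2 - cut1))^scale      # rising ramp from cut1 to cut2
        elseif xi <= cut3
            1.0                                      # plateau between cut2 and cut3
        else
            ((xi - cut4) / (cut3 - cut4))^scale      # falling ramp from cut3 to cut4
        end
    end
    # Rescale from [0, 1] to [des_min, des_max].
    return y .* (des_max - des_min) .+ des_min
end

sketch_d_central([1, 3, 4, 0, -2, 7, 10], 0, 2, 4, 6; scale = 2)
```

In this sketch, scale acts as an exponent on the ramps, so scale = 1 gives straight lines and larger values keep desirability low until closer to the plateau.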
Examples
julia> my_data = [1,3,4,0,-2,7,10]
7-element Vector{Int64}:
1
3
4
0
-2
7
10
julia> d_central(my_data, 0, 2, 4, 6; scale = 2)
7-element Vector{Float64}:
0.25
1.0
1.0
0.0
0.0
0.0
0.0
DesirabilityScores.d_ends — Method
d_ends(x, cut1, cut2, cut3, cut4; des_min = 0, des_max = 1, scale = 1)
Maps a numeric variable to a 0-1 scale such that values at the ends (both high and low) of the distribution are desirable. Values less than cut1 and greater than cut4 will have a high desirability. Values between cut2 and cut3 will have a low desirability. Values between cut1 and cut2 and between cut3 and cut4 will have intermediate values. This function is useful when the data represent differences between groups, where both high and low values are of interest.
Arguments
x: Vector of values to map, whose elements are a subtype of Real.
cut1, cut2, cut3, cut4: Values of the original data that define where the desirability function changes.
des_min, des_max: The lower and upper asymptotes of the function. Default to zero and one, respectively.
scale: Controls how steeply the function increases or decreases.
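The shape described above can be sketched as a piecewise function with a low plateau in the middle. The name sketch_d_ends is hypothetical, and the ramp formulas are an assumption that happens to reproduce the documented example output; treat it as an illustration rather than the package's implementation.

```julia
# Hypothetical helper for illustration only; not part of DesirabilityScores.
function sketch_d_ends(x, cut1, cut2, cut3, cut4; des_min = 0, des_max = 1, scale = 1)
    y = map(x) do xi
        if xi <= cut1 || xi >= cut4
            1.0                                      # both tails are desirable
        elseif xi < cut2
            ((xi - cut2) / (cut1 - cut2))^scale      # falling ramp from cut1 to cut2
        elseif xi <= cut3
            0.0                                      # middle region is undesirable
        else
            ((xi - cut3) / (cut4 - cut3))^scale      # rising ramp from cut3 to cut4
        end
    end
    # Rescale from [0, 1] to [des_min, des_max].
    return y .* (des_max - des_min) .+ des_min
end

sketch_d_ends([1, 3, 4, 0, 2, 7, 10], 0, 2, 4, 6; scale = 0.5)
```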
Examples
julia> my_data
7-element Vector{Int64}:
1
3
4
0
2
7
10
julia> d_ends(my_data, 0, 2, 4, 6; scale = .5)
7-element Vector{Float64}:
0.7071067811865476
0.0
0.0
1.0
0.0
1.0
1.0
DesirabilityScores.d_high — Method
d_high(x, cut1, cut2; des_min = 0, des_max = 1, scale = 1)
Maps a numeric variable to a 0-1 scale such that high values are desirable. Values less than cut1 will have a low desirability. Values greater than cut2 will have a high desirability. Values between cut1 and cut2 will have intermediate values.
Arguments
x: Vector of values to map, whose elements are a subtype of Real.
cut1, cut2: Values of the original data that define where the desirability function changes.
des_min, des_max: The lower and upper asymptotes of the function. Default to zero and one, respectively.
scale: Controls how steeply the function increases or decreases.
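As a rough illustration of how the cut points, scale, and desirability bounds interact, the sketch below clamps a linear ramp, raises it to the power scale, and rescales it. The name sketch_d_high is hypothetical and the formula is an assumption that matches the documented example output, not a copy of the package's code.

```julia
# Hypothetical helper for illustration only; not part of DesirabilityScores.
function sketch_d_high(x, cut1, cut2; des_min = 0, des_max = 1, scale = 1)
    # Linear position of x between the cuts, limited to [0, 1]:
    # 0 at or below cut1, 1 at or above cut2.
    frac = @. clamp((x - cut1) / (cut2 - cut1), 0, 1)
    # Apply the scale exponent, then stretch to [des_min, des_max].
    return @. frac^scale * (des_max - des_min) + des_min
end

sketch_d_high([1, 3, 4, 0, -2, 7, 10], 3, 5)
```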
Examples
julia> my_data = [1,3,4,0,-2,7,10]
7-element Vector{Int64}:
1
3
4
0
-2
7
10
julia> d_high(my_data, 3,5)
7-element Vector{Float64}:
0.0
0.0
0.5
0.0
0.0
1.0
1.0
DesirabilityScores.d_low — Method
d_low(x, cut1, cut2; des_min = 0, des_max = 1, scale = 1)
Maps a numeric variable to a 0-1 scale such that low values are desirable. Values less than cut1 will have a high desirability. Values greater than cut2 will have a low desirability. Values between cut1 and cut2 will have intermediate values.
Arguments
x: Vector of values to map, whose elements are a subtype of Real.
cut1, cut2: Values of the original data that define where the desirability function changes.
des_min, des_max: The lower and upper asymptotes of the function. Default to zero and one, respectively.
scale: Controls how steeply the function increases or decreases.
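The effect of des_min is easiest to see in a small sketch. The name sketch_d_low is hypothetical and the formula is an assumption (a clamped, reversed ramp) that reproduces the documented example output; it is not taken from the package's source.

```julia
# Hypothetical helper for illustration only; not part of DesirabilityScores.
function sketch_d_low(x, cut1, cut2; des_min = 0, des_max = 1, scale = 1)
    # Reversed ramp limited to [0, 1]: 1 at or below cut1, 0 at or above cut2.
    frac = @. clamp((x - cut2) / (cut1 - cut2), 0, 1)
    # Apply the scale exponent, then stretch to [des_min, des_max].
    return @. frac^scale * (des_max - des_min) + des_min
end

sketch_d_low([1, 3, 4, 0, -2, 7, 10], 3, 5; des_min = 0.25)
```

With des_min = 0.25, the least desirable values receive 0.25 rather than 0, so they do not force a combined score to zero when desirabilities are later multiplied together.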
Examples
julia> my_data = [1,3,4,0,-2,7,10]
7-element Vector{Int64}:
1
3
4
0
-2
7
10
julia> d_low(my_data, 3,5; des_min = .25)
7-element Vector{Float64}:
1.0
1.0
0.625
1.0
1.0
0.25
0.25
DesirabilityScores.d_overall — Method
d_overall(d; weights = nothing)
Combines any number of desirability values into an overall desirability.
Arguments
d: A matrix of desirabilities. Rows are observations and columns are desirabilities. Non-missing values must be a subtype of Real.
weights: Allows some desirabilities to count for more in the overall calculation. Defaults to equal weighting. If specified, must be a non-empty vector with elements a subtype of Real.
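The combination step can be sketched as a weighted geometric mean of each row. This is an assumption, chosen because it reproduces the documented example output for d_overall; the name sketch_d_overall is hypothetical, and the sketch does not handle missing values.

```julia
# Hypothetical helper for illustration only; not part of DesirabilityScores.
# Assumes d contains no missing values.
function sketch_d_overall(d; weights = nothing)
    # Equal weights unless the caller supplies one weight per column.
    w = weights === nothing ? ones(size(d, 2)) : weights
    # Weighted geometric mean per row, computed on the log scale.
    return [exp(sum(w .* log.(row)) / sum(w)) for row in eachrow(d)]
end

# Desirabilities copied from three rows of the documented example.
d1 = [1/6, 0.375, 0.0]
d2 = [1/3, 1/3, 0.0]
sketch_d_overall(hcat(d1, d2); weights = [1, 2])
```

A consequence of using a geometric mean is that any desirability of zero (with a non-zero weight) drives the overall score for that observation to zero, which is why a non-zero des_min is sometimes preferred in the individual functions.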
Examples
julia> d1 = d_4pl(my_data; hill = 1, inflec = 5)
7-element Vector{Float64}:
0.16666666666666663
0.375
0.4444444444444444
0.0
0.2857142857142857
0.5833333333333333
0.6666666666666667
julia> d2 = d_high(my_other_data, 2, 5)
7-element Vector{Float64}:
0.3333333333333333
0.3333333333333333
0.6666666666666666
0.0
0.6666666666666666
0.0
1.0
julia> d_overall(hcat(d1, d2); weights = [1, 2])
7-element Vector{Union{Missing, Float64}}:
0.2645668419946999
0.3466806371753174
0.5823869764908659
0.0
0.5026316274194359
0.0
0.8735804647362989
DesirabilityScores.d_rank — Method
d_rank(x; low_to_high = true, method = :ordinal)
Values are ranked from low to high or high to low, and then the ranks are mapped to a 0-1 scale.
Arguments
x: A non-empty vector of values to map. Non-missing elements must be a subtype of Real.
low_to_high: If true, low ranks have high desirabilities; if false, high ranks have high desirabilities. Defaults to true.
method: A symbol specifying the method used to rank x. Options include :ordinal, :compete, :dense, and :tied. These are the same options offered by the ranking functions in StatsBase.jl (which this function uses). See that package's documentation for more details.
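The rank-then-rescale idea can be sketched with Base Julia alone. The name sketch_d_rank is hypothetical; the sketch covers only ordinal ranking of data without missing values, and the handling of low_to_high = false (simply reversing the scores) is an assumption. For tie-free input it reproduces the documented example output.

```julia
# Hypothetical helper for illustration only; not part of DesirabilityScores.
# Assumes length(x) > 1 and no missing values; ties are broken by position.
function sketch_d_rank(x; low_to_high = true)
    n = length(x)
    ranks = invperm(sortperm(x))      # ordinal ranks, 1 = smallest value
    d = (n .- ranks) ./ (n - 1)       # rank 1 maps to 1.0, rank n maps to 0.0
    return low_to_high ? d : 1 .- d
end

sketch_d_rank([5, 10, -4.5, 8, pi, exp(1), -100])
```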
Examples
julia> to_rank = [5,10,-4.5, 8, pi, exp(1), -100]
7-element Vector{Float64}:
5.0
10.0
-4.5
8.0
3.141592653589793
2.718281828459045
-100.0
julia> d_rank(to_rank; method = :compete)
7-element Vector{Float64}:
0.3333333333333333
0.0
0.8333333333333334
0.16666666666666666
0.5
0.6666666666666666
1.0
DesirabilityScores.des_plot — Method
des_plot(x, y; des_line_col = :black, des_line_width = 3, hist_args...)
Plots a histogram and overlays the desirability scores.
Arguments
x: A non-empty vector of values to map. Non-missing elements must be a subtype of Real. Need not be sorted; sorting is done before the data are passed to the plotting function. This also means that tuples are not acceptable (since they are immutable).
y: A non-empty vector of desirability scores. Need not be sorted, but must be in the proper order with respect to x (i.e., datum x[1] has desirability y[1]). As with x, tuples are not acceptable.
des_line_col: A string or symbol specifying the color of the line.
des_line_width: An integer specifying the line width.
hist_args...: Additional arguments for the Plots.jl histogram() function.
Examples
x = randn(1000)
y = d_high(x, -1, 1; des_min = 0.1, des_max = 0.8, scale = 2)
des_plot(x, y, des_line_col = :orange1; color = :steelblue)
References
Farmer P, Bonnefoi H, Becette V, Tubiana-Hulin M, Fumoleau P, Larsimont D, Macgrogan G, Bergh J, Cameron D, Goldstein D, Duss S, Nicoulaz AL, Brisken C, Fiche M, Delorenzi M, Iggo R. Identification of molecular apocrine breast tumours by microarray analysis. Oncogene. 2005 24(29):4660-4671.
Lazic SE (2015). Ranking, selecting, and prioritising genes with desirability functions. PeerJ 3:e1444.