DesirabilityScores

This package is a port of the desiR R package. It contains functions for ranking, selecting, and integrating data. Its main uses to date have been (1) prioritising genes, proteins, and metabolites from high-dimensional biology experiments, (2) multivariate hit calling in high-content drug discovery screens, and (3) combining data from diverse sources.

The vignette and publication provide more details.

The source code is available on GitHub.

Exported Functions

DesirabilityScores.d_4pl - Method
d_4pl(x; hill, inflec, des_min = 0, des_max = 1)

Maps a numeric variable to a 0-1 scale with a four-parameter logistic function.

Arguments

  • x: Vector of values to map whose elements are a subtype of Real. The values must also be non-negative: the other functions offered here can process negative inputs, but d_4pl can map negative x to values outside the unit interval, depending on the parameter settings.

  • des_min, des_max: The lower and upper asymptotes of the function. Default to zero and one, respectively.

  • hill: Hill coefficient. It controls the steepness and direction of the slope. A value greater than zero has a positive slope and a value less than zero has a negative slope. The higher the absolute value, the steeper the slope.

  • inflec: Inflection point. The point on the x-axis where the curvature of the function changes from concave upwards to concave downwards (or vice versa).

Details

This function uses a four-parameter logistic model to map a numeric variable onto a 0-1 scale. The hill parameter controls whether high or low values are deemed desirable: when hill > 0, high values are desirable; when hill < 0, low values are desirable.
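
The mapping can be sketched in a few lines. This is a minimal illustration, not the package's implementation: four_pl_sketch is a hypothetical name, and the formula is the standard four-parameter logistic, assumed here because it reproduces the example output listed for d_4pl.

```julia
# Hypothetical helper, not part of DesirabilityScores: a scalar 4PL curve.
# Assumed formula: des_max + (des_min - des_max) / (1 + (x / inflec)^hill)
function four_pl_sketch(x::Real; hill::Real, inflec::Real,
                        des_min::Real = 0, des_max::Real = 1)
    return des_max + (des_min - des_max) / (1 + (x / inflec)^hill)
end

my_data = [1, 3, 4, 0, 2, 7, 10]

# hill > 0: the curve rises, so high values are desirable.
increasing = four_pl_sketch.(my_data; hill = 1, inflec = 5)

# hill < 0: the curve falls, so low values are desirable.
decreasing = four_pl_sketch.(my_data; hill = -1, inflec = 5)
```

With the default asymptotes, a value equal to inflec maps to 0.5 regardless of hill, and in this sketch the two curves above are mirror images (their desirabilities sum to one).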

Examples

julia> my_data = [1,3,4,0,2,7,10] 
7-element Vector{Int64}:
  1
  3
  4
  0
  2
  7
 10

julia> d_4pl(my_data; hill = 1, inflec = 5)
7-element Vector{Float64}:
 0.16666666666666663
 0.375
 0.4444444444444444
 0.0
 0.2857142857142857
 0.5833333333333333
 0.6666666666666667

DesirabilityScores.d_central - Method
d_central(x, cut1, cut2, cut3, cut4; des_min = 0, des_max = 1, scale = 1)

Maps a numeric variable to a 0-1 scale such that values in the middle of the distribution are desirable. Values less than cut1 and greater than cut4 will have a low desirability. Values between cut2 and cut3 will have a high desirability. Values between cut1 and cut2 and between cut3 and cut4 will have intermediate values. This function is useful when extreme values are undesirable, for example outliers or values outside of allowable ranges. If cut2 and cut3 are close to each other, this function can also be used when a single target value is desirable.
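
The piecewise shape described above can be sketched for a single value as follows. central_sketch is a hypothetical helper, not part of the package, and the sketch assumes that the ramps between the cuts are power functions of scale, which is consistent with the example output for this function.

```julia
# Hypothetical helper, not part of DesirabilityScores.
# Assumes cut1 < cut2 <= cut3 < cut4.
function central_sketch(x::Real, cut1, cut2, cut3, cut4;
                        des_min = 0, des_max = 1, scale = 1)
    d = if x <= cut1 || x >= cut4
        0.0                                   # outside the outer cuts: least desirable
    elseif x < cut2
        ((x - cut1) / (cut2 - cut1))^scale    # rising ramp between cut1 and cut2
    elseif x <= cut3
        1.0                                   # plateau between cut2 and cut3
    else
        ((x - cut4) / (cut3 - cut4))^scale    # falling ramp between cut3 and cut4
    end
    return d * (des_max - des_min) + des_min  # rescale to [des_min, des_max]
end

my_data = [1, 3, 4, 0, -2, 7, 10]
central = central_sketch.(my_data, 0, 2, 4, 6; scale = 2)
# central == [0.25, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]

# Target-value use: with cut2 == cut3 the plateau shrinks to the single point 5.
target = central_sketch.([4, 5, 6], 3, 5, 5, 7)
# target == [0.5, 1.0, 0.5]
```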

Arguments

  • x: Vector of values to map whose elements are a subtype of Real.

  • cut1, cut2, cut3, cut4: Values of the original data that define where the desirability function changes.

  • des_min, des_max: The minimum and maximum desirability values. Default to zero and one, respectively.

  • scale: Controls how steeply the function increases or decreases.

Examples

julia> my_data = [1,3,4,0,-2,7,10]
7-element Vector{Int64}:
  1
  3
  4
  0
 -2
  7
 10

julia> d_central(my_data, 0, 2, 4, 6; scale = 2) 
7-element Vector{Float64}:
 0.25
 1.0
 1.0
 0.0
 0.0
 0.0
 0.0

DesirabilityScores.d_ends - Method
d_ends(x, cut1, cut2, cut3, cut4; des_min = 0, des_max = 1, scale = 1)

Maps a numeric variable to a 0-1 scale such that values at the ends (both high and low) of the distribution are desirable. Values less than cut1 and greater than cut4 will have a high desirability. Values between cut2 and cut3 will have a low desirability. Values between cut1 and cut2 and between cut3 and cut4 will have intermediate values. This function is useful when the data represent differences between groups, where both high and low values are of interest.

Arguments

  • x: Vector of values to map whose elements are a subtype of Real.

  • cut1, cut2, cut3, cut4: Values of the original data that define where the desirability function changes.

  • des_min, des_max: The minimum and maximum desirability values. Default to zero and one, respectively.

  • scale: Controls how steeply the function increases or decreases.

Examples

julia> my_data = [1,3,4,0,2,7,10]
7-element Vector{Int64}:
  1
  3
  4
  0
  2
  7
 10

julia> d_ends(my_data, 0, 2, 4, 6; scale = .5) 
7-element Vector{Float64}:
 0.7071067811865476
 0.0
 0.0
 1.0
 0.0
 1.0
 1.0

DesirabilityScores.d_high - Method
d_high(x, cut1, cut2; des_min = 0, des_max = 1, scale = 1)

Maps a numeric variable to a 0-1 scale such that high values are desirable. Values less than cut1 will have a low desirability. Values greater than cut2 will have a high desirability. Values between cut1 and cut2 will have intermediate values.
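
As a rough sketch of this mapping (not the package's code): the value is placed on a ramp between cut1 and cut2, clamped to the unit interval, raised to the power scale, and rescaled to the range from des_min to des_max. high_sketch is a hypothetical name, and the power-function form of the ramp is an assumption carried over from the desiR package that this one ports.

```julia
# Hypothetical helper, not part of DesirabilityScores.
# Assumes cut1 < cut2.
function high_sketch(x::Real, cut1, cut2; des_min = 0, des_max = 1, scale = 1)
    # clamp covers the two flat regions: 0 below cut1 and 1 above cut2.
    d = clamp((x - cut1) / (cut2 - cut1), 0.0, 1.0)^scale
    return d * (des_max - des_min) + des_min
end

my_data = [1, 3, 4, 0, -2, 7, 10]

linear  = high_sketch.(my_data, 3, 5)             # straight ramp between the cuts
squared = high_sketch.(my_data, 3, 5; scale = 2)  # the same ramp, squared
# linear[3] == 0.5 and squared[3] == 0.25 (the value 4 lies halfway between the cuts)
```

In this sketch, a scale above one pulls intermediate values towards des_min and a scale below one pushes them towards des_max; values outside the cuts are unaffected.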

Arguments

  • x: Vector of values to map whose elements are a subtype of Real.

  • cut1, cut2: Values of the original data that define where the desirability function changes.

  • des_min, des_max: The minimum and maximum desirability values. Default to zero and one, respectively.

  • scale: Controls how steeply the function increases or decreases.

Examples

julia> my_data = [1,3,4,0,-2,7,10]
7-element Vector{Int64}:
  1
  3
  4
  0
 -2
  7
 10

julia> d_high(my_data, 3,5)
7-element Vector{Float64}:
 0.0
 0.0
 0.5
 0.0
 0.0
 1.0
 1.0

DesirabilityScores.d_low - Method
d_low(x, cut1, cut2; des_min = 0, des_max = 1, scale = 1)

Maps a numeric variable to a 0-1 scale such that low values are desirable. Values less than cut1 will have a high desirability. Values greater than cut2 will have a low desirability. Values between cut1 and cut2 will have intermediate values.

Arguments

  • x: Vector of values to map whose elements are a subtype of Real.

  • cut1, cut2: Values of the original data that define where the desirability function changes.

  • des_min, des_max: The minimum and maximum desirability values. Default to zero and one, respectively.

  • scale: Controls how steeply the function increases or decreases.
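
The effect of des_min and des_max can be illustrated with a small sketch. low_sketch is a hypothetical helper, not part of the package; it assumes the raw 0-1 desirability is rescaled linearly to the range from des_min to des_max, which agrees with the d_low example output (a raw value of 0.5 becomes 0.5 * 0.75 + 0.25 = 0.625 when des_min = 0.25).

```julia
# Hypothetical helper, not part of DesirabilityScores.
# Assumes cut1 < cut2.
function low_sketch(x::Real, cut1, cut2; des_min = 0, des_max = 1, scale = 1)
    # Raw desirability: 1 below cut1, 0 above cut2, a falling ramp in between.
    raw = clamp((x - cut2) / (cut1 - cut2), 0.0, 1.0)^scale
    # Linear rescaling from [0, 1] to [des_min, des_max].
    return raw * (des_max - des_min) + des_min
end

my_data = [1, 3, 4, 0, -2, 7, 10]

unscaled = low_sketch.(my_data, 3, 5)                  # floor of 0.0
floored  = low_sketch.(my_data, 3, 5; des_min = 0.25)  # floor of 0.25
# floored == [1.0, 1.0, 0.625, 1.0, 1.0, 0.25, 0.25]
```

A non-zero des_min can matter when several desirabilities are later combined: in the d_overall example, any row containing a desirability of zero has an overall desirability of zero.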

Examples

julia> my_data = [1,3,4,0,-2,7,10]
7-element Vector{Int64}:
  1
  3
  4
  0
 -2
  7
 10

julia> d_low(my_data, 3,5; des_min = .25)
7-element Vector{Float64}:
 1.0
 1.0
 0.625
 1.0
 1.0
 0.25
 0.25

DesirabilityScores.d_overall - Method
d_overall(d; weights = nothing)

Combines any number of desirability values into an overall desirability.

Arguments

  • d: A matrix of desirabilities. Rows are observations and columns are desirabilities. Non-missing values must be a subtype of Real.

  • weights: Allows some desirabilities to count for more in the overall calculation. Defaults to equal weighting. If specified, must be a non-empty vector whose elements are a subtype of Real.
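
One way to picture the combination is as a weighted geometric mean of each row. The sketch below is an assumption based on the desiR package and on the example output for this function, not a copy of the package's code; overall_sketch is a hypothetical name, and unlike d_overall it makes no attempt to handle missing values.

```julia
# Hypothetical helper, not part of DesirabilityScores.
# Computes exp(sum(w .* log.(row)) / sum(w)) for every row of d.
function overall_sketch(d::AbstractMatrix{<:Real},
                        weights::AbstractVector{<:Real} = ones(size(d, 2)))
    # log(0.0) is -Inf, so a single zero desirability drives the row's result to 0.
    return [exp(sum(weights .* log.(row)) / sum(weights)) for row in eachrow(d)]
end

d = [1/6 1/3;
     0.0 0.5]

weighted = overall_sketch(d, [1, 2])  # second column counts twice as much
equal    = overall_sketch(d)          # equal weighting
```

The first row gives (1/6)^(1/3) * (1/3)^(2/3), roughly 0.2646, which is the first value in the d_overall example output; the second row is zero because one of its desirabilities is zero.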

Examples

julia> d1 = d_4pl(my_data; hill = 1, inflec =5)
7-element Vector{Float64}:
 0.16666666666666663
 0.375
 0.4444444444444444
 0.0
 0.2857142857142857
 0.5833333333333333
 0.6666666666666667

julia> d2 = d_high(my_other_data, 2, 5)
7-element Vector{Float64}:
 0.3333333333333333
 0.3333333333333333
 0.6666666666666666
 0.0
 0.6666666666666666
 0.0
 1.0

julia> d_overall(hcat(d1, d2); weights = [1, 2]) 
7-element Vector{Union{Missing, Float64}}:
 0.2645668419946999
 0.3466806371753174
 0.5823869764908659
 0.0
 0.5026316274194359
 0.0
 0.8735804647362989

DesirabilityScores.d_rank - Method
d_rank(x; low_to_high = true, method = :ordinal)

Values are ranked from low to high or high to low, and then the ranks are mapped to a 0-1 scale.

Arguments

  • x: A non-empty vector of values to map. Non-missing elements must be a subtype of Real.

  • low_to_high: If true, low ranks have high desirabilities; if false, high ranks have high desirabilities. Defaults to true.

  • method: A symbol specifying the method that should be used to rank x. Options include :ordinal, :compete, :dense, and :tied. Note these are the same options offered by ranking functions in StatsBase.jl (which this function uses). See that package's documentation for more details.
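
The rank-to-desirability mapping can be sketched without StatsBase.jl. rank_sketch is a hypothetical helper that only covers ordinal ranking (ties are broken by position), and the linear rescaling of ranks to the unit interval is an assumption that matches the example output for this function. It needs at least two values, otherwise the rescaling divides by zero.

```julia
# Hypothetical helper, not part of DesirabilityScores.
function rank_sketch(x::AbstractVector{<:Real}; low_to_high::Bool = true)
    ranks = invperm(sortperm(x))          # ordinal ranks: 1 for the smallest value
    lo, hi = extrema(ranks)
    scaled = (ranks .- lo) ./ (hi - lo)   # 0 for the smallest value, 1 for the largest
    # low_to_high = true: the lowest rank gets the highest desirability.
    return low_to_high ? 1 .- scaled : scaled
end

to_rank = [5, 10, -4.5, 8, pi, exp(1), -100]

low_is_good  = rank_sketch(to_rank)                       # -100 maps to 1, 10 maps to 0
high_is_good = rank_sketch(to_rank; low_to_high = false)  # 10 maps to 1, -100 maps to 0
```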

Examples

julia> to_rank = [5,10,-4.5, 8, pi, exp(1), -100]
7-element Vector{Float64}:
    5.0
   10.0
   -4.5
    8.0
    3.141592653589793
    2.718281828459045
 -100.0

julia> d_rank(to_rank; method = :compete) 
7-element Vector{Float64}:
 0.3333333333333333
 0.0
 0.8333333333333334
 0.16666666666666666
 0.5
 0.6666666666666666
 1.0

DesirabilityScores.des_plot - Method
des_plot(x, y; des_line_col = :black, des_line_width = 3, hist_args...)

Plots a histogram and overlays the desirability scores.

Arguments

  • x: A non-empty vector of values to map. Non-missing elements must be a subtype of Real. Need not be sorted – this is done before passing to the plotting function. This also means that tuples are not acceptable (since they are immutable).

  • y: A non-empty vector of desirability scores. Need not be sorted, but must be in the proper order with respect to x (i.e., datum x[1] has desirability y[1]). As with x, tuples are not acceptable.

  • des_line_col: A string or symbol specifying color of the line.

  • des_line_width: An integer specifying the line width.

  • hist_args...: Additional keyword arguments passed to the histogram() function of Plots.jl.

Examples

    x = randn(1000)
    y = d_high(x, -1, 1; des_min = 0.1, des_max = 0.8, scale = 2)

    des_plot(x, y; des_line_col = :orange1, color = :steelblue)

References

Farmer P, Bonnefoi H, Becette V, Tubiana-Hulin M, Fumoleau P, Larsimont D, Macgrogan G, Bergh J, Cameron D, Goldstein D, Duss S, Nicoulaz AL, Brisken C, Fiche M, Delorenzi M, Iggo R (2005). Identification of molecular apocrine breast tumours by microarray analysis. Oncogene 24(29):4660-4671.

Lazic SE (2015). Ranking, selecting, and prioritising genes with desirability functions. PeerJ 3:e1444.