public abstract class ProbabilisticClassificationModel<FeaturesType,M extends ProbabilisticClassificationModel<FeaturesType,M>> extends ClassificationModel<FeaturesType,M>
Model produced by a ProbabilisticClassifier.
Classes are indexed {0, 1, ..., numClasses - 1}.
Constructor and Description |
---|
ProbabilisticClassificationModel() |
Modifier and Type | Method and Description |
---|---|
Param<java.lang.String> | featuresCol(): Param for features column name. |
java.lang.String | getFeaturesCol() |
java.lang.String | getLabelCol() |
java.lang.String | getPredictionCol() |
java.lang.String | getRawPredictionCol() |
Param<java.lang.String> | labelCol(): Param for label column name. |
static void | normalizeToProbabilitiesInPlace(DenseVector v): Normalize a vector of raw predictions to be a multinomial probability vector, in place. |
Param<java.lang.String> | predictionCol(): Param for prediction column name. |
protected Vector | predictProbability(FeaturesType features): Predict the probability of each class given the features. |
protected double | probability2prediction(Vector probability): Given a vector of class conditional probabilities, select the predicted label. |
protected double | raw2prediction(Vector rawPrediction): Given a vector of raw predictions, select the predicted label. |
protected Vector | raw2probability(Vector rawPrediction): Non-in-place version of raw2probabilityInPlace() |
protected abstract Vector | raw2probabilityInPlace(Vector rawPrediction): Estimate the probability of each class given the raw prediction, doing the computation in-place. |
Param<java.lang.String> | rawPredictionCol(): Param for raw prediction (a.k.a. confidence) column name. |
M | setProbabilityCol(java.lang.String value) |
M | setThresholds(double[] value) |
DataFrame | transform(DataFrame dataset): Transforms dataset by reading from featuresCol, and appending new columns as specified by parameters: predicted labels as predictionCol of type Double; raw predictions (confidences) as rawPredictionCol of type Vector; probability of each class as probabilityCol of type Vector. |
StructType | validateAndTransformSchema(StructType schema, boolean fitting, DataType featuresDataType) |
StructType | validateAndTransformSchema(StructType schema, boolean fitting, DataType featuresDataType): Validates and transforms the input schema with the provided param map. |
numClasses, predict, predictRaw, setRawPredictionCol
featuresDataType, numFeatures, setFeaturesCol, setPredictionCol, transformImpl, transformSchema
transform, transform, transform
transformSchema
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
clear, copy, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn, validateParams
toString, uid
initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning
public static void normalizeToProbabilitiesInPlace(DenseVector v)
The input raw predictions should be >= 0. The output vector sums to 1, unless the input vector is all-0 (in which case the output is all-0 too).
NOTE: This is NOT applicable to all models, only ones which effectively use class instance counts for raw predictions.
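The normalization described above can be illustrated with a minimal sketch. It uses a plain double[] rather than Spark's DenseVector, and the class and method names (NormalizeSketch, normalizeInPlace) are hypothetical; it is not the library's implementation, only an instance of the stated contract (non-negative input, output sums to 1, all-0 input stays all-0).

```java
import java.util.Arrays;

public class NormalizeSketch {
    /**
     * Scales the entries of v in place so that they sum to 1.
     * Assumes every entry is >= 0. An all-zero vector is left unchanged.
     */
    public static void normalizeInPlace(double[] v) {
        double sum = 0.0;
        for (double x : v) {
            sum += x;
        }
        if (sum == 0.0) {
            return; // all-0 input: output stays all-0
        }
        for (int i = 0; i < v.length; i++) {
            v[i] /= sum;
        }
    }

    public static void main(String[] args) {
        // Raw predictions that behave like class instance counts.
        double[] counts = {3.0, 1.0, 0.0};
        normalizeInPlace(counts);
        System.out.println(Arrays.toString(counts)); // prints [0.75, 0.25, 0.0]
    }
}
```

Dividing by the sum only yields a valid probability vector when the raw predictions are non-negative, which is why the note restricts this helper to count-like raw predictions.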
v - (undocumented)
public M setProbabilityCol(java.lang.String value)
public M setThresholds(double[] value)
public DataFrame transform(DataFrame dataset)
Transforms dataset by reading from featuresCol, and appending new columns as specified by parameters:
- predicted labels as predictionCol of type Double
- raw predictions (confidences) as rawPredictionCol of type Vector
- probability of each class as probabilityCol of type Vector.
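To make the three appended columns concrete, here is a minimal per-row sketch in plain Java. It does not use Spark; the class and helper names (TransformRowSketch, rawToProbability, argmaxLabel) are hypothetical, the raw-to-probability step is assumed to be simple normalization of non-negative scores, and the label is assumed to be a plain argmax. A real model may compute probabilities differently and may also take thresholds into account.

```java
import java.util.Arrays;

public class TransformRowSketch {
    /** Hypothetical raw-to-probability step: scales non-negative scores to sum to 1. */
    public static double[] rawToProbability(double[] raw) {
        double[] p = Arrays.copyOf(raw, raw.length);
        double sum = 0.0;
        for (double x : p) {
            sum += x;
        }
        if (sum != 0.0) {
            for (int i = 0; i < p.length; i++) {
                p[i] /= sum;
            }
        }
        return p;
    }

    /** Returns the index of the largest entry as a double label (first index wins ties). */
    public static double argmaxLabel(double[] v) {
        int best = 0;
        for (int i = 1; i < v.length; i++) {
            if (v[i] > v[best]) {
                best = i;
            }
        }
        return (double) best;
    }

    public static void main(String[] args) {
        double[] rawPrediction = {1.0, 6.0, 3.0};                 // rawPredictionCol value
        double[] probability = rawToProbability(rawPrediction);   // probabilityCol value
        double prediction = argmaxLabel(probability);             // predictionCol value
        System.out.println("rawPrediction = " + Arrays.toString(rawPrediction));
        System.out.println("probability   = " + Arrays.toString(probability));
        System.out.println("prediction    = " + prediction);
    }
}
```

The label is a double in {0, 1, ..., numClasses - 1}, matching the class indexing stated at the top of this page.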
Overrides transform in class ClassificationModel<FeaturesType,M extends ProbabilisticClassificationModel<FeaturesType,M>>
dataset - input dataset
protected abstract Vector raw2probabilityInPlace(Vector rawPrediction)
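Estimate the probability of each class given the raw prediction, doing the computation in-place.
This internal method is used to implement transform() and output probabilityCol.
rawPrediction - (undocumented)
protected Vector raw2probability(Vector rawPrediction)
Non-in-place version of raw2probabilityInPlace()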
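The relationship between the in-place and non-in-place variants can be sketched as follows. This is an illustration in plain Java, assuming the non-in-place version simply copies its argument and delegates; all names here (InPlaceSketch, ModelSketch, CountModelSketch, rawToProbabilityInPlace, rawToProbability) are hypothetical, and the concrete subclass uses count normalization only as an example.

```java
public class InPlaceSketch {
    /** Hypothetical base class mirroring the in-place / non-in-place split. */
    public abstract static class ModelSketch {
        /** Overwrites raw with class probabilities and returns the same array. */
        public abstract double[] rawToProbabilityInPlace(double[] raw);

        /** Non-in-place version: works on a copy so the caller's array is untouched. */
        public double[] rawToProbability(double[] raw) {
            return rawToProbabilityInPlace(raw.clone());
        }
    }

    /** Example subclass whose raw predictions are non-negative counts. */
    public static class CountModelSketch extends ModelSketch {
        @Override
        public double[] rawToProbabilityInPlace(double[] raw) {
            double sum = 0.0;
            for (double x : raw) {
                sum += x;
            }
            if (sum != 0.0) {
                for (int i = 0; i < raw.length; i++) {
                    raw[i] /= sum;
                }
            }
            return raw;
        }
    }

    public static void main(String[] args) {
        ModelSketch model = new CountModelSketch();
        double[] raw = {2.0, 2.0};
        double[] copy = model.rawToProbability(raw);           // raw is still {2.0, 2.0}
        double[] same = model.rawToProbabilityInPlace(raw);    // raw is now {0.5, 0.5}
        System.out.println(copy != raw);  // prints true
        System.out.println(same == raw);  // prints true
    }
}
```

Keeping the in-place method abstract lets a subclass avoid an extra allocation per row, while callers that need to keep the raw prediction can use the copying variant.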
protected double raw2prediction(Vector rawPrediction)
Given a vector of raw predictions, select the predicted label.
Overrides raw2prediction in class ClassificationModel<FeaturesType,M extends ProbabilisticClassificationModel<FeaturesType,M>>
rawPrediction - (undocumented)
protected Vector predictProbability(FeaturesType features)
Predict the probability of each class given the features.
This internal method is used to implement transform() and output probabilityCol.
features - (undocumented)
protected double probability2prediction(Vector probability)
Given a vector of class conditional probabilities, select the predicted label.
probability - (undocumented)
public StructType validateAndTransformSchema(StructType schema, boolean fitting, DataType featuresDataType)
public Param<java.lang.String> rawPredictionCol()
public java.lang.String getRawPredictionCol()
public StructType validateAndTransformSchema(StructType schema, boolean fitting, DataType featuresDataType)
Validates and transforms the input schema with the provided param map.
schema - input schema
fitting - whether this is in fitting
featuresDataType - SQL DataType for FeaturesType. E.g., VectorUDT for vector features.
public Param<java.lang.String> labelCol()
public java.lang.String getLabelCol()
public Param<java.lang.String> featuresCol()
public java.lang.String getFeaturesCol()
public Param<java.lang.String> predictionCol()
public java.lang.String getPredictionCol()