#' See [Survival Analysis with Accelerated Failure Time](https://xgboost.readthedocs.io/en/latest/tutorials/aft_survival_analysis.html) for details.
#' - `"multi:softmax"`: set XGBoost to do multiclass classification using the softmax objective, you also need to set num_class(number of classes)
460
462
#' - `"multi:softprob"`: same as softmax, but output a vector of `ndata * nclass`, which can be further reshaped to `ndata * nclass` matrix. The result contains predicted probability of each data point belonging to each class.
#' - `"rank:ndcg"`: Use LambdaMART to perform pair-wise ranking where the normalized discounted cumulative gain (NDCG) is maximized. This objective supports position debiasing for click data.
#' - `"rank:map"`: Use LambdaMART to perform pair-wise ranking where the mean average precision (MAP) is maximized.
#' - `"rank:pairwise"`: Use LambdaRank to perform pair-wise ranking using the `ranknet` objective.
#' - `"reg:gamma"`: gamma regression with log-link. The output is the mean of the gamma distribution. It might be useful, e.g., for modeling insurance claims severity, or for any outcome that might be gamma-distributed.
#' - `"reg:tweedie"`: Tweedie regression with log-link. It might be useful, e.g., for modeling total loss in insurance, or for any outcome that might be Tweedie-distributed.
#' @param verbosity (default=1)
#' Verbosity of printing messages. Valid values are 0 (silent), 1 (warning), 2 (info), 3
#' (debug). Sometimes XGBoost tries to change configurations based on heuristics, which
#' is displayed as a warning message.
#' @param eval_metric (default according to objective)
#' - Evaluation metrics for validation data. A default metric will be assigned according to the objective (rmse for regression, logloss for classification, mean average precision for `rank:map`, etc.).
#' - Users can add multiple evaluation metrics (the first sketch after this list shows this).
#' - The choices are listed below:
#' - `"rmse"`: root mean square error.
#' - `"rmsle"`: root mean square log error: \eqn{\sqrt{\frac{1}{N}[log(pred + 1) - log(label + 1)]^2}}. Default metric of `"reg:squaredlogerror"` objective. This metric reduces errors generated by outliers in dataset. But because `log` function is employed, `"rmsle"` might output `nan` when prediction value is less than -1. See `"reg:squaredlogerror"` for other requirements.
#' - `"mphe"`: mean Pseudo Huber error. Default metric of `"reg:pseudohubererror"` objective.
#' - `"logloss"`: negative log-likelihood.
#' - `"error"`: Binary classification error rate. It is calculated as `#(wrong cases)/#(all cases)`. For the predictions, the evaluation will regard the instances with prediction value larger than 0.5 as positive instances, and the others as negative instances.
#' - `"error@t"`: a different than 0.5 binary classification threshold value could be specified by providing a numerical value through 't'.
#' - `"merror"`: Multiclass classification error rate. It is calculated as `#(wrong cases)/#(all cases)`.
#' - `"auc"`: [Receiver Operating Characteristic Area under the Curve](https://en.wikipedia.org/wiki/Receiver_operating_characteristic#Area_under_the_curve).
628
+
#' - `"auc"`: area under the receiver-operating characteristic curve.
#' Available for classification and learning-to-rank tasks.
#' - When used with binary classification, the objective should be `"binary:logistic"` or similar functions that work on probability.
#' - When used with multi-class classification, the objective should be `"multi:softprob"` instead of `"multi:softmax"`, as the latter doesn't output probabilities. Also, the AUC is calculated by 1-vs-rest with the reference class weighted by class prevalence.
#' - When used with the LTR task, the AUC is computed by comparing pairs of documents to count correctly sorted pairs. This corresponds to pairwise learning to rank. The current implementation has some known issues: the average AUC over groups and over distributed workers is not well-defined.
#' - On a single machine the AUC calculation is exact. In a distributed environment the AUC is a weighted average over the AUC of training rows on each node - therefore, distributed AUC is an approximation sensitive to the distribution of data across workers. Use another metric in distributed environments if precision and reproducibility are important.
#' - When the input dataset contains only negative or only positive samples, the output is `NaN`. The behavior is implementation-defined; for instance, `scikit-learn` returns \eqn{0.5} instead.
#' - `"aucpr"`: area under the precision-recall (PR) curve.
#' Available for classification and learning-to-rank tasks.
#'
#' After XGBoost 1.6, both the requirements and restrictions for using `"aucpr"` in classification problems are similar to `"auc"`. For the ranking task, only binary relevance labels \eqn{y \in [0, 1]} are supported. Different from `"map"` (mean average precision), `"aucpr"` calculates the *interpolated* area under the precision-recall curve using continuous interpolation.
#'
#' - `"pre"`: Precision at \eqn{k}. Supports only learning to rank task.