Hints:
- Tune mtry
- Use cross-validation
- Use default values unless stated otherwise.
- Fix the random seed at 42.
# Setup:
library(tidymodels)
library(tidyverse)
library(tictoc)  # timing

# Data:
d_path <- "https://vincentarelbundock.github.io/Rdatasets/csv/palmerpenguins/penguins.csv"
d <- read_csv(d_path)
Rows: 344 Columns: 9
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (3): species, island, sex
dbl (6): rownames, bill_length_mm, bill_depth_mm, flipper_length_mm, body_ma...
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# rm NA in the dependent variable:
d <- d %>%
  drop_na(body_mass_g)

set.seed(42)
d_split <- initial_split(d)
d_train <- training(d_split)
d_test <- testing(d_split)

# model:
mod_rf <-
  rand_forest(mode = "regression",
              mtry = tune())

# cv:
set.seed(42)
rsmpl <- vfold_cv(d_train)

# recipe:
rec_plain <-
  recipe(body_mass_g ~ ., data = d_train) %>%
  step_impute_bag(all_predictors())

# workflow:
wf1 <-
  workflow() %>%
  add_model(mod_rf) %>%
  add_recipe(rec_plain)

# tuning:
tic()
wf1_fit <-
  wf1 %>%
  tune_grid(resamples = rsmpl)
i Creating pre-processing data to finalize unknown parameter: mtry
toc()
23.078 sec elapsed
# best candidate:
show_best(wf1_fit)
Warning: No value of `metric` was given; metric 'rmse' will be used.
# A tibble: 5 × 7
   mtry .metric .estimator  mean     n std_err .config
  <int> <chr>   <chr>      <dbl> <int>   <dbl> <chr>
1     2 rmse    standard    282.    10   11.1  Preprocessor1_Model5
2     3 rmse    standard    282.    10   10.6  Preprocessor1_Model7
3     8 rmse    standard    282.    10    9.84 Preprocessor1_Model2
4     5 rmse    standard    283.    10    9.41 Preprocessor1_Model3
5     4 rmse    standard    283.    10    9.95 Preprocessor1_Model4
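The warning above appears because show_best() was called without a metric and falls back to RMSE. Passing the metric explicitly makes the choice visible and should silence the warning. The following is a minimal sketch on the built-in mtcars data rather than the penguins data. The data set, the 5-fold setup, the grid size of 3, and object names such as `tuned` are assumptions for illustration, and the ranger package (the default engine of rand_forest()) is assumed to be installed.

```r
library(tidymodels)

# Toy resampling scheme on a built-in data set (assumption, not the penguins data)
set.seed(42)
folds <- vfold_cv(mtcars, v = 5)

# Random forest with mtry marked for tuning, as in the exercise
wf <- workflow() %>%
  add_model(rand_forest(mode = "regression", mtry = tune())) %>%
  add_formula(mpg ~ .)

# Small grid of 3 candidates to keep the run short
set.seed(42)
tuned <- tune_grid(wf, resamples = folds, grid = 3)

# Name the metric explicitly instead of relying on the default
best_rows  <- show_best(tuned, metric = "rmse")
best_param <- select_best(tuned, metric = "rmse")
best_param
```

The same `metric = "rmse"` argument could be added to the show_best() and select_best() calls in the solution.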
# finalize wf:
wf1_final <-
  wf1 %>%
  finalize_workflow(select_best(wf1_fit))
Warning: No value of `metric` was given; metric 'rmse' will be used.
wf1_fit_final <-
  wf1_final %>%
  last_fit(d_split)

# model performance on the test set:
collect_metrics(wf1_fit_final)
# A tibble: 2 × 4
  .metric .estimator .estimate .config
  <chr>   <chr>          <dbl> <chr>
1 rmse    standard     327.    Preprocessor1_Model1
2 rsq     standard       0.817 Preprocessor1_Model1
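For reference, both test-set metrics can be computed by hand. RMSE is the square root of the mean squared prediction error, and yardstick's rsq is the squared correlation between observed and predicted values. A minimal base-R sketch follows; the helper names and the toy vectors are made up for illustration and are not the penguin predictions.

```r
# Root mean squared error: sqrt of the average squared difference
rmse_manual <- function(truth, estimate) {
  sqrt(mean((truth - estimate)^2))
}

# R squared as the squared Pearson correlation (the definition yardstick::rsq uses)
rsq_manual <- function(truth, estimate) {
  cor(truth, estimate)^2
}

# Hypothetical observed and predicted body masses in grams
truth    <- c(3500, 4200, 5100, 3900)
estimate <- c(3600, 4000, 5000, 4100)

rmse_manual(truth, estimate)
rsq_manual(truth, estimate)
```

With the fitted object from the solution, the same helpers could be applied to the body_mass_g and .pred columns returned by collect_predictions(wf1_fit_final).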
Caution: step_impute_knn seems to have problems when there are character variables.
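If the problem does stem from character columns, one thing to try is converting them to factors before the imputation step. The sketch below illustrates that idea on a small made-up data frame. The data, the column names, and `neighbors = 3` are assumptions, and this has not been checked against the penguins data or the setup used above.

```r
library(tidymodels)

# Hypothetical toy data: one character column, one numeric column with a missing value
toy <- data.frame(
  y   = c(10, 12, 11, 15, 14, 13, 16, 12),
  x1  = c(1.0, 2.1, NA, 4.2, 3.9, 3.1, 5.0, 2.2),
  x2  = c(5, 6, 5.5, 8, 7.5, 7, 9, 6),
  grp = c("a", "b", "a", "b", "b", "a", "b", "a"),
  stringsAsFactors = FALSE
)

rec_knn <- recipe(y ~ ., data = toy) %>%
  # convert character predictors to factors first
  step_string2factor(all_nominal_predictors()) %>%
  # then impute the numeric predictors from their nearest neighbors
  step_impute_knn(all_numeric_predictors(), neighbors = 3)

# Estimate the recipe and return the processed training data
baked <- rec_knn %>%
  prep() %>%
  bake(new_data = NULL)

baked
```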
Categories:
tidymodels
statlearning
template
string
Source Code
---
exname: rf-finalize
expoints: 1
extype: string
exsolution: NA
categories:
- tidymodels
- statlearning
- template
- string
date: '2023-05-17'
slug: rf-finalize
title: rf-finalize

---

# Task

<!-- Write a template for a predictive analysis with tidymodels! -->

Compute a predictive model using this model formula:

`body_mass_g ~ .` (data set: palmerpenguins::penguins).

Report the RMSE on the test sample!

Hints:

- Tune `mtry`
- Use cross-validation
- Use default values unless stated otherwise.
- Fix the random seed at 42.

</br></br></br></br></br></br></br></br></br></br>

# Solution

```{r}
# Setup:
library(tidymodels)
library(tidyverse)
library(tictoc)  # timing

# Data:
d_path <- "https://vincentarelbundock.github.io/Rdatasets/csv/palmerpenguins/penguins.csv"
d <- read_csv(d_path)

# rm NA in the dependent variable:
d <- d %>%
  drop_na(body_mass_g)

set.seed(42)
d_split <- initial_split(d)
d_train <- training(d_split)
d_test <- testing(d_split)

# model:
mod_rf <-
  rand_forest(mode = "regression",
              mtry = tune())

# cv:
set.seed(42)
rsmpl <- vfold_cv(d_train)

# recipe:
rec_plain <-
  recipe(body_mass_g ~ ., data = d_train) %>%
  step_impute_bag(all_predictors())

# workflow:
wf1 <-
  workflow() %>%
  add_model(mod_rf) %>%
  add_recipe(rec_plain)

# tuning:
tic()
wf1_fit <-
  wf1 %>%
  tune_grid(resamples = rsmpl)
toc()

# best candidate:
show_best(wf1_fit)

# finalize wf:
wf1_final <-
  wf1 %>%
  finalize_workflow(select_best(wf1_fit))

wf1_fit_final <-
  wf1_final %>%
  last_fit(d_split)

# model performance on the test set:
collect_metrics(wf1_fit_final)
```

Caution: `step_impute_knn` seems to have problems when there are character variables.

---

Categories: 

- tidymodels
- statlearning
- template
- string