

November 15, 2023


Write a prototypical analysis for a predictive model that can serve as a template for analyses of this kind!

Do not use resampling or tuning.


  • Fit one model.
  • Do not tune any model parameter.
  • Do not use cross-validation.
  • Use default values unless stated otherwise.
  • Fix the random seed at 42.
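
The seed requirement matters because the train/test split is drawn at random. As a minimal sketch of what fixing the seed buys you, the following uses the built-in `mtcars` data as a stand-in (an assumption made only to keep the example self-contained); the object names `split_a` and `split_b` are made up for illustration.

```r
library(rsample)

# Fix the random seed right before the random step (the train/test split)
set.seed(42)
split_a <- initial_split(mtcars)

# Re-seeding with the same value reproduces the same split
set.seed(42)
split_b <- initial_split(mtcars)

# Both splits should contain exactly the same training rows
identical(training(split_a), training(split_b))
```

Without the second `set.seed(42)`, the two splits would in general differ.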


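# Setup:
library(tidymodels)
library(tidyverse)  # read_csv() comes from readr

# Data:
d_path <- "https://vincentarelbundock.github.io/Rdatasets/csv/palmerpenguins/penguins.csv"
d <- read_csv(d_path)

set.seed(42)  # fix the random numbers, as the task requires
d_split <- initial_split(d)
d_train <- training(d_split)
d_test <- testing(d_split)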

# model:
mod1 <-
  rand_forest(mode = "regression")

# cv (not used below, since the task rules out resampling):
rsmpl <- vfold_cv(d_train)

# recipe:
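rec1 <- recipe(body_mass_g ~ ., data = d_train) |> 
  step_unknown(all_nominal_predictors(), new_level = "NA") |>  # missing factor levels become the level "NA"
  step_naomit(all_predictors()) |>                             # drop rows with missing predictors
  step_dummy(all_nominal_predictors()) |>                      # dummy-code nominal predictors
  step_zv(all_predictors()) |>                                 # remove zero-variance predictors
  step_normalize(all_predictors())  # final step inferred from the baked output, which shows standardized predictors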

# workflow:
wf1 <-
  workflow() %>% 
  add_model(mod1) %>% 
  add_recipe(rec1)

# fit (no tuning, no resampling):
wf1_fit <-
  wf1 %>% 
  last_fit(split = d_split)
→ A | error:   Missing data in columns: bill_length_mm, bill_depth_mm, flipper_length_mm.
There were issues with some computations   A: x1
Warning: All models failed. Run `show_notes(.Last.tune.result)` for more
0.594 sec elapsed
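
A likely cause of this error: `step_naomit()` has `skip = TRUE` by default, so it is applied to the training data but skipped when the recipe is baked on new data. The test rows that `last_fit()` hands to the model therefore still contain missing values, which the default random forest engine appears to reject. The sketch below illustrates the difference on made-up toy data (`toy_train` and `toy_test` are hypothetical, not the penguin data) and shows median imputation as one possible remedy; it is not necessarily the remedy the exercise intends.

```r
library(recipes)

# Toy data, made up for illustration: one missing predictor value in each set
toy_train <- data.frame(y = c(1, 2, 3, 4), x = c(10, NA, 30, 40))
toy_test  <- data.frame(y = c(5, 6),       x = c(NA, 60))

# Variant A: step_naomit() with its default skip = TRUE
rec_omit <- recipe(y ~ x, data = toy_train) |>
  step_naomit(all_predictors()) |>
  prep()

# Variant B: impute the training median instead of dropping rows
rec_impute <- recipe(y ~ x, data = toy_train) |>
  step_impute_median(all_numeric_predictors()) |>
  prep()

baked_omit   <- bake(rec_omit, new_data = toy_test)
baked_impute <- bake(rec_impute, new_data = toy_test)

anyNA(baked_omit$x)    # the missing value survives in new data
anyNA(baked_impute$x)  # the missing value has been filled in
```

Other options would be to remove incomplete rows before calling `initial_split()`, for example with `tidyr::drop_na()`, or to set `skip = FALSE` in `step_naomit()`; which one fits best depends on how predictions for incomplete cases should be handled.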

As a check, here is the prepped and baked recipe:

rec1_prepped <- prep(rec1)
d_train_baked <- bake(rec1_prepped, new_data = NULL)
d_train_baked |> head()
# A tibble: 6 × 12
  rownames bill_length_mm bill_depth_mm flipper_length_mm    year body_mass_g
     <dbl>          <dbl>         <dbl>             <dbl>   <dbl>       <dbl>
1   -1.24          -1.53          0.386            -0.794 -1.29          3450
2    1.45           1.32          0.386            -0.365  1.14          3675
3   -0.212          0.401        -1.97              0.707 -1.29          4500
4   -0.993          0.343         0.887            -0.294 -0.0757        4150
5    0.530          0.879        -0.566             2.07  -0.0757        5800
6   -0.281         -0.957         0.787            -1.15   1.14          3650
# ℹ 6 more variables: species_Chinstrap <dbl>, species_Gentoo <dbl>,
#   island_Dream <dbl>, island_Torgersen <dbl>, sex_male <dbl>, sex_NA. <dbl>

# summary statistics per column (call reconstructed from the output format)
datawizard::describe_distribution(d_train_baked)
Variable          |      Mean |     SD |     IQR |              Range | Skewness | Kurtosis |   n | n_Missing
rownames          | -5.63e-17 |   1.00 |    1.70 |      [-1.72, 1.68] |    -0.01 |    -1.21 | 257 |         0
bill_length_mm    | -2.97e-16 |   1.00 |    1.68 |      [-2.28, 2.98] |     0.01 |    -0.79 | 257 |         0
bill_depth_mm     |  2.71e-16 |   1.00 |    1.60 |      [-2.02, 2.19] |    -0.11 |    -0.87 | 257 |         0
flipper_length_mm | -9.83e-16 |   1.00 |    1.64 |      [-1.94, 2.07] |     0.32 |    -1.02 | 257 |         0
year              | -6.89e-14 |   1.00 |    2.43 |      [-1.29, 1.14] |    -0.12 |    -1.51 | 257 |         0
body_mass_g       |   4200.97 | 792.54 | 1212.50 | [2700.00, 6300.00] |     0.49 |    -0.69 | 257 |         0
species_Chinstrap | -2.24e-17 |   1.00 |    0.00 |      [-0.50, 1.98] |     1.49 |     0.22 | 257 |         0
species_Gentoo    |  1.64e-17 |   1.00 |    2.07 |      [-0.76, 1.31] |     0.56 |    -1.70 | 257 |         0
island_Dream      | -5.50e-17 |   1.00 |    2.08 |      [-0.75, 1.34] |     0.60 |    -1.66 | 257 |         0
island_Torgersen  |  1.72e-17 |   1.00 |    0.00 |      [-0.41, 2.43] |     2.04 |     2.18 | 257 |         0
sex_male          | -5.86e-17 |   1.00 |    2.00 |      [-0.96, 1.03] |     0.07 |    -2.01 | 257 |         0
sex_NA.           |  1.45e-17 |   1.00 |    0.00 |      [-0.15, 6.46] |     6.35 |    38.63 | 257 |         0
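
The visual check above (no missing values, predictors with mean 0 and SD 1) can also be done programmatically. The following is a minimal sketch, again using the built-in `mtcars` data as a stand-in for the penguin data (an assumption); `rec_check`, `baked`, and `predictors` are illustrative names.

```r
library(recipes)

# mtcars stands in for the penguin data (assumption for this sketch)
rec_check <- recipe(mpg ~ ., data = mtcars) |>
  step_normalize(all_numeric_predictors()) |>
  prep()

# new_data = NULL returns the processed training data
baked <- bake(rec_check, new_data = NULL)
predictors <- baked[, setdiff(names(baked), "mpg")]

# No missing values should remain after preprocessing
stopifnot(!anyNA(baked))

# Normalized predictors should have mean ~0 and SD ~1 (up to rounding error)
stopifnot(all(abs(colMeans(predictors)) < 1e-8))
stopifnot(all(abs(sapply(predictors, sd) - 1) < 1e-8))
```

Such assertions stop the script early if the preprocessing does not behave as expected, which is useful when the analysis serves as a template.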


