ames-kaggle1
Exercise
Fit a simple linear model for the Ames House Prices Kaggle competition.
Notes:
- Otherwise, follow the general guidelines of the Datenwerk.
Solution
Load packages
library(tidyverse)
library(easystats)
Import data
# URLs of the training and test data
d_train_path_online <- "https://raw.githubusercontent.com/sebastiansauer/Lehre/main/data/ames-kaggle/train.csv"
d_test_path_online <- "https://raw.githubusercontent.com/sebastiansauer/Lehre/main/data/ames-kaggle/test.csv"
d_train <- read_csv(d_train_path_online)
Rows: 1460 Columns: 81
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (43): MSZoning, Street, Alley, LotShape, LandContour, Utilities, LotConf...
dbl (38): Id, MSSubClass, LotFrontage, LotArea, OverallQual, OverallCond, Ye...
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
d_test <- read_csv(d_test_path_online)
Rows: 1459 Columns: 80
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (43): MSZoning, Street, Alley, LotShape, LandContour, Utilities, LotConf...
dbl (37): Id, MSSubClass, LotFrontage, LotArea, OverallQual, OverallCond, Ye...
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Define the model
# Simple linear regression: sale price explained by overall quality
m1 <- lm(SalePrice ~ OverallQual, data = d_train)
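To make clear what such a model estimates, here is a minimal sketch on a small made-up data frame. The object `toy` and its values are hypothetical and not taken from the Ames data; only the model formula mirrors `m1`.

```r
# Hypothetical toy data, NOT taken from the Ames data set
toy <- data.frame(
  OverallQual = c(3, 5, 6, 8),
  SalePrice   = c(90000, 130000, 170000, 270000)
)

# Same model form as m1: one predictor, fitted by ordinary least squares
m_toy <- lm(SalePrice ~ OverallQual, data = toy)

# Intercept and slope; the slope is the expected change in price
# per additional point of overall quality
coef(m_toy)
```

Running `coef(m1)` on the real model gives the corresponding two numbers for the Ames training data.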
Predict new data
# Predicted sale price for every house in the test set
m1_pred <- predict(m1, newdata = d_test)
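For a model with a single predictor, `predict()` simply evaluates intercept plus slope times the predictor. The sketch below checks this on made-up data (`toy_train` and `toy_test` are hypothetical, not Ames data). It also shows why houses with the same `OverallQual` necessarily receive the same predicted price, which is why repeated values appear in the submission preview.

```r
# Hypothetical toy data, NOT taken from the Ames data set
toy_train <- data.frame(
  OverallQual = c(3, 5, 6, 8),
  SalePrice   = c(90000, 130000, 170000, 270000)
)
toy_test <- data.frame(OverallQual = c(5, 6, 5))

m_toy <- lm(SalePrice ~ OverallQual, data = toy_train)

# Prediction via predict()
pred <- predict(m_toy, newdata = toy_test)

# Same prediction computed by hand from the coefficients
b <- coef(m_toy)
manual <- b[["(Intercept)"]] + b[["OverallQual"]] * toy_test$OverallQual

# Both approaches agree (up to floating-point tolerance)
all.equal(unname(pred), manual)

# Rows 1 and 3 share the same OverallQual, so their predictions are identical
pred[[1]] == pred[[3]]
```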
Prepare the submission
# Submission table: one row per test house with its Id and predicted price
d_subm <-
  d_test %>%
  select(Id) %>%
  mutate(SalePrice = m1_pred)
head(d_subm)
# A tibble: 6 × 2
Id SalePrice
<dbl> <dbl>
1 1461 130973.
2 1462 176409.
3 1463 130973.
4 1464 176409.
5 1465 267280.
6 1466 176409.
write_csv(d_subm, file = "einreichen-kaggle-modell1-yeah.csv")
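Before uploading, it can help to run a few sanity checks on the submission table. The following is a sketch under assumptions: the helper `check_submission()` is hypothetical (not part of any package), and it assumes the competition expects exactly the columns `Id` and `SalePrice`, one row per test house, and no missing predictions. The toy values echo the first rows of the preview above.

```r
# Hypothetical helper: stops with an error if the table looks malformed
check_submission <- function(subm, n_expected) {
  stopifnot(
    identical(names(subm), c("Id", "SalePrice")),  # expected column names
    nrow(subm) == n_expected,                      # one row per test house
    !anyNA(subm$SalePrice),                        # no missing predictions
    !anyDuplicated(subm$Id)                        # each Id appears only once
  )
  invisible(TRUE)
}

# Small stand-in for d_subm, using the first three rows of the preview
toy_subm <- data.frame(
  Id        = c(1461, 1462, 1463),
  SalePrice = c(130973, 176409, 130973)
)

check_submission(toy_subm, n_expected = 3)
```

For the real data, the call would be `check_submission(d_subm, n_expected = nrow(d_test))`. Missing predictions would arise if `OverallQual` were `NA` for some test rows, since `predict()` returns `NA` in that case.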
Categories:
- regression
- ames
- kaggle
- string