Use the passenger data from the Titanic shipwreck to answer the question:
"What sorts of people were more likely to survive?"
You will be given each passenger's name, age, gender, socio-economic class, etc.
The data has been split into two groups:
- training set (train.csv)
- test set (test.csv)
pclass: A proxy for socio-economic status (SES)
1st = Upper, 2nd = Middle, 3rd = Lower
age: Age is fractional if less than 1. If the age is estimated, it is in the form xx.5
sibsp: Number of siblings/spouses aboard. The dataset defines family relations in this way...
Sibling = brother, sister, stepbrother, stepsister
Spouse = husband, wife
parch: Number of parents/children aboard. The dataset defines family relations in this way...
Parent = mother, father
Child = daughter, son, stepdaughter, stepson
Some children traveled only with a nanny, therefore parch=0 for them.
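The fields described above can be turned into numeric model inputs roughly as follows. This is only a sketch under assumptions: the capitalized column names (Pclass, Sex, Age, SibSp, Parch, Survived) are assumed to match the headers in train.csv, the two sample rows are made up for illustration and are not taken from the data, and default_age is a hypothetical placeholder used when the age is blank.

```python
import csv
import io


def encode_passenger(row, default_age=30.0):
    """Turn one CSV row (a dict of strings) into a list of numeric features."""
    age_text = (row.get("Age") or "").strip()
    # Blank ages are replaced by default_age (an arbitrary, assumed value).
    age = float(age_text) if age_text else default_age
    return [
        float(row["Pclass"]),                                   # 1, 2 or 3
        1.0 if row["Sex"].strip().lower() == "female" else 0.0,  # gender flag
        age,
        float(row["SibSp"]),                                    # siblings/spouses
        float(row["Parch"]),                                    # parents/children
    ]


# Made-up rows for illustration only; in practice open train.csv instead.
SAMPLE = """Survived,Pclass,Sex,Age,SibSp,Parch
0,3,male,22,1,0
1,1,female,,1,0
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
features = [encode_passenger(r) for r in rows]
labels = [int(r["Survived"]) for r in rows]
print(features, labels)
```

To run it on the real file, replace io.StringIO(SAMPLE) with an open handle on train.csv. A better strategy for missing ages (for example the median age) is probably worth trying.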
3) Building ML models (deep learning)
- install tensorflow & keras
- train a dense model (at least two dense layers, adam, relu/softsign, dropout layers)
- print the model structure with model.summary() and save it to a text or image file (screenshot)
- calculate scores and save the models in json & hdf5 format (model.to_json())
- make a prediction script (a command-line tool that asks for age, gender,
socio-economic class, etc. and returns the prediction)
socio-economic class: 1
Most likely you would not survive the titanic crash (DEAD 0.9311)
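The deep learning steps above might be sketched as follows. This is a minimal illustration, not a reference solution: the layer sizes, the dropout rate, the file names, and the wording of the "survive" message are all assumptions, as is the reading of the number in the output as the probability of the predicted class. TensorFlow is imported inside build_model so the helpers can be loaded even where it is not installed.

```python
def build_model(n_features, hidden_units=(32, 16), dropout_rate=0.2):
    """Build a small dense binary classifier (sizes are assumed, not prescribed)."""
    from tensorflow import keras  # lazy import: only needed when building

    model = keras.Sequential()
    model.add(keras.Input(shape=(n_features,)))
    for units in hidden_units:
        # One dense layer followed by dropout, repeated for each hidden size.
        model.add(keras.layers.Dense(units, activation="relu"))
        model.add(keras.layers.Dropout(dropout_rate))
    # A single sigmoid unit gives the estimated probability of survival.
    model.add(keras.layers.Dense(1, activation="sigmoid"))
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model


def save_model(model, json_path="model.json", weights_path="model.weights.h5"):
    """Write the architecture as JSON and the weights as an HDF5 file."""
    with open(json_path, "w") as fh:
        fh.write(model.to_json())
    model.save_weights(weights_path)


def format_prediction(p_survive):
    """Render the verdict line; the ALIVE wording is an assumption."""
    if p_survive >= 0.5:
        return ("Most likely you would survive the titanic crash "
                f"(ALIVE {p_survive:.4f})")
    return ("Most likely you would not survive the titanic crash "
            f"(DEAD {1 - p_survive:.4f})")


print(format_prediction(0.0689))
```

A typical session would call build_model with the number of input features, print model.summary(), train with model.fit on the encoded training data, and then call save_model. The prediction script can collect the answers with input(), encode them the same way as the training data, and pass the model output to format_prediction.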
* GPU vs CPU
For our exercises, the CPU alone is sufficient, but in real-life scenarios
deep learning training can require a lot of RAM and computing power. A GPU
can therefore be used, but this is also the tricky part. First of all, you need an NVIDIA card
(most laptops do not have a separate graphics card). Next, you need to install
a special NVIDIA driver supporting deep learning (it seems easy to install, but
it frequently leads to serious problems, including complete system failure or
re-installation of the X environment).
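Before spending time on drivers, it may help to check which devices TensorFlow actually sees. The sketch below assumes TensorFlow 2.x; describe_devices is a hypothetical helper name, and it simply reports a missing installation instead of failing.

```python
def describe_devices():
    """Return a one-line summary of the devices TensorFlow can use."""
    try:
        import tensorflow as tf
    except ImportError:
        return "TensorFlow is not installed"
    # Lists the GPUs that TensorFlow has detected (empty list on CPU-only setups).
    gpus = tf.config.list_physical_devices("GPU")
    if gpus:
        return f"{len(gpus)} GPU(s) visible to TensorFlow"
    return "No GPU visible; TensorFlow will run on the CPU"


print(describe_devices())
```

If no GPU is reported, training simply falls back to the CPU, which is enough for this lab.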
Make a PDF report with all plots and tables summarizing Titanic parts 1-3.
If possible, draw some conclusions.
Additionally, provide all scripts and separate plot image files, such as:
- initial data exploration plots
- the decision tree visualizations
- for deep learning, also provide the json & hdf5 models
All files should be sent by 07.06.2020
via email to firstname.lastname@example.org with the email subject
'lab13_hw_Name_Surname', without any email body text, and with a
'lab13_hw_Name_Surname.7z' (ASCII letters only) attachment.
Emails with a different structure (i.e. ones that will not pass
through the email filter into the email folder dedicated to
homework) will be scored -10%
Using non-English labels, legends, descriptions, etc. will be scored -10%