
Voiko vähästä oppia : koneoppimisen haasteet pienellä aineistolla [Can one learn from little: the challenges of machine learning with small data]

This bachelor’s thesis examines machine learning with small datasets. In machine learning, a system improves its performance on a specific task on its own as it accumulates experience or data. Machine learning problems can be divided into classification and regression problems. Training an accurate machine learning model usually requires a large amount of data, but obtaining enough data is often difficult. The aim of this thesis is to review the problems encountered when training a machine learning model with only a small amount of data, and the solutions to these problems. The thesis was carried out as a literature review. The publications examined address these problems and the solutions developed for them. The review shows that training a model that generalizes well is more challenging with little data, and that overfitting is difficult to avoid. To improve generalization, the thesis examines the SMOTE technique for generating synthetic data, and to prevent overfitting it discusses regularization.
