Introduction

Machine learning (ML) is arguably one of the most prominent data science tools for advancing water resources research. ML models can learn complex underlying relationships of a system and thus find applications in many water resources topics, from river ecosystems to water supply. We will cover a variety of learning algorithms and methods for optimizing ML models so that they generalize to unseen data, principally supervised and unsupervised learning techniques.

The goal of machine learning

Machine learning aims at computationally learning complex relationships from experience (i.e., data). Computational learning is a subfield of artificial intelligence (AI) that focuses on developing models that enable computers to learn and make predictions or decisions without being explicitly programmed. It involves designing and implementing mathematical and statistical models that can automatically analyze data, identify patterns, and make informed decisions or predictions based on the observed data. The learning task may be, for instance, predicting or modelling complex phenomena. Note that predicting here does not refer only to the future but to any unobserved event. For instance, we can predict whether a chemical substance will be, was, or is dissolved in water given a set of environmental conditions.
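The dissolved-substance example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the observations are made up, the two conditions (water temperature and pH) are chosen arbitrarily, and the 1-nearest-neighbour rule stands in for whichever learning method one would actually use. The function name `predict_dissolved` is hypothetical.

```python
import math

# Made-up observations (assumption, not measured data):
# (water temperature in degrees C, pH) -> was the substance dissolved?
observations = [
    ((5.0, 6.5), False),
    ((8.0, 7.0), False),
    ((20.0, 7.2), True),
    ((25.0, 6.8), True),
]


def predict_dissolved(conditions):
    """Predict the label of the closest known observation (1-nearest neighbour)."""
    _, label = min(observations, key=lambda obs: math.dist(obs[0], conditions))
    return label


# "Predicting" an unobserved event: conditions that are not in the data.
warm_sample = predict_dissolved((22.0, 7.0))
cold_sample = predict_dissolved((6.0, 6.8))
print(warm_sample, cold_sample)
```

Nothing in the code refers to time: the same call answers whether the substance will be, was, or is dissolved, as long as the environmental conditions are known. In practice the features would need scaling, since temperature dominates the distance here.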

Contrary to popular belief, ML algorithms have been around for several decades. However, they only gained widespread attention in the last decade, when limited computing power ceased to be an obstacle to applying ML algorithms to build useful ML models. We refer to an algorithm as the set of basic instructions that tell a model how to learn from data, whereas an ML model is the result (i.e., the learned program) of learning the target task from the selected set of rules (the ML algorithm) and examples (i.e., data).
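The distinction between algorithm and model can be made concrete with a minimal sketch. Here ordinary least squares for a straight line plays the role of the ML algorithm, and the function it returns is the learned model. The rainfall and runoff values are invented for illustration, and the names `fit_line` and `model` are hypothetical.

```python
def fit_line(xs, ys):
    """The algorithm: ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    def model(x):
        """The model: the learned program, fixed once training is done."""
        return slope * x + intercept

    return model


# Invented examples (assumption): rainfall depth vs. runoff depth, both in mm.
rainfall = [10.0, 20.0, 30.0, 40.0]
runoff = [4.0, 9.0, 14.0, 19.0]

model = fit_line(rainfall, runoff)  # rules (algorithm) + examples (data) -> model
print(model(25.0))  # estimate for a rainfall depth not in the examples
```

Feeding the same algorithm different examples yields a different model, which is why the two terms are kept apart.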

Types of machine learning

In this section, we deal mainly with basic elements of supervised learning, but note that there are several other types of ML problems, unsupervised learning being one of them.

The difference between machine learning and data science

The conceptual difference between data science and machine learning resembles the relationship between rectangles and squares in geometry, where data science corresponds to rectangles and machine learning to squares: just as every square is a rectangle, machine learning is a part of data science. Both data science and machine learning deal with programming (e.g., in Python, R, or SQL), statistics, and data modeling. Data science additionally embraces data visualization and data wrangling.