SEM-Machine-Learning Glossary

To get started with Machine Learning and apply its techniques to Search Engine Marketing, it’s helpful to know a few basic terms you’ll come across quite often. All of these terms are commonly used on these pages and in Machine Learning in general.

Term Explanation
Algorithm Process of defined (mathematical) steps to solve a certain problem
Artificial Intelligence The process of building intelligent systems that are able to mirror human behaviour
Attributes The properties recorded for each observation in your dataset, stored in its columns
Classification Routine of categorizing values into different classes and groups of data. Can be used for audience or keyword classification, among other things
Click Through Rate The number of clicks on your ad divided by the number of Impressions
Clustering Technique of grouping datapoints into clusters based on their similarities. K-means is a widely used algorithm for this task
Confusion matrix Table that compares the predicted classes with the actual classes to describe the results and performance of a classification model in Machine Learning
Conversions The defined and measured actions a user takes on your website, based on your KPIs
Correlation Statistical measure of how strongly two or more variables move together
csv Comma-separated values files are a plain-text file format for storing data in tables
Data cleaning Process of removing or replacing wrong data, fixing missing data and detecting outliers to prepare your csv file for Machine Learning
Data collection Procedure of gaining relevant data from all kinds of sources like website-tracking, CRM or 3rd-party providers
Data science The wonderful discipline of gaining knowledge and insights from data
Dataset Collection of datapoints built from rows and columns including variables and features
Decision Tree Classification algorithm within Supervised Learning that predicts new values based on learned rules and decisions and represents the output in tree form
Deep learning Subset of Machine Learning that uses multi-layered neural networks, loosely inspired by the human brain, to learn a task by training repeatedly on huge datasets
Descriptive statistics Figures that are used to summarize and describe data, such as the mean, median and standard deviation
Feature The data-input provided to a Machine Learning Model in Supervised Learning
Feature selection Technique to select the most valuable features for your purpose from your columns and get rid of the irrelevant ones
Histogram Chart that displays the distribution of numerical data by counting how many values fall into each range
Impressions Number of times your search ad was shown to users
Input The information (x), i.e. the features, you provide to your machine in supervised learning together with the matching labels
K-means Clustering Algorithm within Unsupervised Learning Models that calculates the distance between entities to group data in clusters. Often used for market- or audience segmentation.
K-Nearest Neighbours KNN is a classification algorithm within Supervised Learning which assigns a new datapoint to the class that is most common among the labeled datapoints closest to it
Kaggle Online Community of data scientists, well known for its Machine Learning competitions
Label The data-output provided to a Machine Learning Model in Supervised Learning
Linear Regression Regression algorithm within Supervised Learning which fits a linear relationship between features and a numerical label in labeled data to predict new values
Logistic Regression Classification algorithm within Supervised Learning which models the probability of an outcome depending on one or more variables
Machine Learning Subset of Artificial Intelligence that works with huge amounts of data to, for example, predict new values and find patterns and similarities without being explicitly programmed
Matplotlib Python library for data visualization through charts, scatterplots and histograms
Mean Statistical measure describing the numerical average
Median Statistical measure describing the value in the middle of a sorted group of numbers
Model Representation of the computation that an algorithm has learned from data and that turns provided input into output
Model Accuracy The metric that describes how often a model’s predictions are correct, based on a test with provided data
Naive Bayes Classification algorithm within Supervised Learning that applies Bayes’ theorem and assumes that the variables are independent of each other to predict an outcome
NumPy Python library for numerical computing with vectors, matrices and multi-dimensional arrays
Output The labeled information (y) you provide to your machine in supervised learning. It’s the result we already know and pair with the input (x), so that the model can learn the relationship between the two and its prediction errors can be measured
Overfitting Describes a model that fits its training data “too well”, including noise, and therefore performs poorly on new data
Pandas Python library for data analysis with a csv-reading function for data import and functions for data cleaning
Probability Field of mathematics, closely tied to statistics, that quantifies the chance of occurrence of certain events. Commonly used for click-prediction in ad-tech
Python The most widely used programming language in Machine Learning and Data Science
PyTorch Machine Learning framework for Python, derived from the Lua-based Torch library, which is based on imperative programming
R Commonly used programming language in Machine Learning for statistical purposes
Random Forest Classification algorithm within Supervised Learning that uses a group of Decision Trees to predict new values
Reinforcement Learning Type of Machine Learning where the model follows a trial and error approach and learns from exploration and mistakes. It can be applied to automated bidding such as Real Time Bidding.
scikit-learn Machine Learning library for Python that provides several algorithms for supervised and unsupervised learning
Segmentation Clustering datapoints into different groups based on similarities and patterns
Standard Deviation Numerical value in statistics that describes how much the numbers of a group differ from the mean
Statistical Fit Numerical value that describes how well your approximation matches your target
Statistics Mathematical discipline dealing with the analysis and interpretation of numerical data
Stochastic Gradient Descent Optimization algorithm that adjusts a model’s parameters step by step to reduce its error, using only one or a few datapoints per step. Commonly used as a learning algorithm for very large datasets
Supervised Learning In Supervised Learning you train an algorithm by providing it with labeled data
Support Vector Machines Classification algorithm within Supervised Learning that splits a dataset into pre-defined categories by finding the boundary that separates them best
TensorFlow Machine Learning framework with capabilities to build Deep Learning models and Neural Networks
Tracking The process of gaining data from interactions users have with your website
Training The process of providing input-data to your model to improve accuracy
Underfitting Describes a model that is too simple to capture the patterns in its data and therefore performs poorly even on its training data
Unsupervised Learning Unsupervised Learning works without labeled data. Algorithms in USL cluster and group datasets based on similarities, patterns and associations.
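
Several of the terms above (Click Through Rate, Mean, Median, Standard Deviation, Confusion matrix and Model Accuracy) are simple enough to calculate by hand. The following is a minimal sketch that uses only Python’s standard library. The function names are hypothetical helpers written for this glossary, and all numbers are made up for illustration; they do not come from a real account or from any particular library.

```python
from statistics import mean, median, pstdev


def click_through_rate(clicks, impressions):
    """Clicks divided by impressions; 0.0 if the ad was never shown."""
    if impressions == 0:
        return 0.0
    return clicks / impressions


def confusion_matrix(actual, predicted):
    """Count the four outcomes of a binary classifier (1 = converted, 0 = not)."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["tp"] += 1  # predicted a conversion, and it happened
        elif a == 0 and p == 1:
            counts["fp"] += 1  # predicted a conversion, but none happened
        elif a == 1 and p == 0:
            counts["fn"] += 1  # missed a real conversion
        else:
            counts["tn"] += 1  # correctly predicted no conversion
    return counts


def accuracy(counts):
    """Share of all predictions that were correct."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return (counts["tp"] + counts["tn"]) / total


# Made-up daily figures for a single keyword (illustrative only).
clicks = [12, 30, 18, 25, 15]
impressions = [400, 1000, 600, 500, 500]

daily_ctr = [click_through_rate(c, i) for c, i in zip(clicks, impressions)]
print("Daily CTR:", daily_ctr)
print("Mean CTR:", mean(daily_ctr))
print("Median CTR:", median(daily_ctr))
print("Standard deviation of CTR:", pstdev(daily_ctr))

# Made-up labels (y) and model predictions for five visitors.
actual = [1, 0, 1, 1, 0]
predicted = [1, 0, 0, 1, 1]

matrix = confusion_matrix(actual, predicted)
print("Confusion matrix:", matrix)
print("Model accuracy:", accuracy(matrix))
```

pstdev computes the population standard deviation; if the days are treated as a sample of a longer period, statistics.stdev would be the more fitting choice. In practice, libraries such as pandas and scikit-learn offer ready-made functions for all of these calculations.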