The Machine Learning Google doesn’t show you in Smart Bidding
Google’s automated bid-optimization feature, Smart Bidding, is a great tool: it optimizes your bids toward specific business goals and can save you tons of time and money. In Google’s own words, Smart Bidding helps to “make more accurate predictions across your account”.
The flip side is the black box you have to deal with. You throw something in (input and output figures) and hope for efficient results based on the data you’ve provided. All the steps in between, including feature selection and regularization, Google handles for you quietly.
So let’s imagine the Smart Bidding algorithm(s) were your own and you had full insight. What more would you see? What possibilities for model optimization would you have?
1. Learning rate & Gradient Descent convergence
In machine learning models, the learning rate is the hyperparameter that defines the size of the adjustment steps that gradient descent (the most commonly chosen optimization algorithm for regression and classification problems) takes to find the optimal fit for the model’s function, or in other words: the way to reach the highest accuracy.
A learning rate that is set too low causes slow training and data processing. A learning rate that’s too high can destroy your model’s accuracy, because the gradient steps are too big and make your model jump around like crazy.
The interactive graph tool on Google’s developer page shows the effect of a poorly chosen learning rate quite well: https://developers.google.com/machine-learning/crash-course/fitter/graph
You can define the learning rate yourself (a common starting point is somewhere between 0.1 and 1), but there are also a couple of algorithms that can automatically pick a learning rate for you.
To track whether our algorithm is moving in the right direction, we plot the loss on a graph and check whether gradient descent is converging. This is progress you can’t retrace when using Smart Bidding.
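To make the effect concrete, here is a minimal sketch in plain Python, using toy data and made-up learning rates (certainly not Google’s actual implementation), showing how the learning rate decides whether gradient descent converges:

```python
# Minimal gradient descent on a one-parameter model y = w * x.
# Toy data with true slope w = 2; learning rates are illustrative only.

def fit(learning_rate, steps=100):
    data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
    w = 0.0
    for _ in range(steps):
        # gradient of the mean squared error with respect to w
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= learning_rate * grad  # step size scaled by the learning rate
    return w

w_good = fit(learning_rate=0.1)    # converges close to the true value 2
w_slow = fit(learning_rate=0.001)  # steps too small: still far from 2
w_big = fit(learning_rate=0.25)    # steps too big: oscillates and diverges
```

With a well-chosen rate the weight settles near 2 within a few steps; the too-small rate crawls, and the too-large one overshoots further on every step, which is exactly the “jumping around” behavior described above.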
2. Feature selection & parameter weight updates
In supervised machine learning we work with labeled data, often in the form of CSV files. Everything comes in rows and columns.
Artificial neural networks, the fundamental architecture behind Google’s Smart Bidding, are organized in different kinds of layers: an input layer, one or more hidden layers, and an output layer. The hidden layers hold the parameter weights that get updated as the model learns.
When we use Smart Bidding, we can roughly guess the input of the first layer (user metadata such as device, browser, location, or affinities), but we have no idea about the weights within Google’s algorithm. We don’t know which parameter the algorithm considers most important, and we don’t see how this changes as the model learns.
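To illustrate what those hidden weights are, here is a tiny, entirely hypothetical forward pass through one hidden layer. The input features and every weight value are invented for illustration; the learned weights are precisely the part of Smart Bidding we never get to inspect:

```python
import math

# Toy fully connected network: 2 inputs -> 2 hidden units -> 1 output score.
# All weights below are made up; in a real model they are learned from data.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# hypothetical input features, e.g. device=mobile (1.0) and a scaled time-of-day
x = [1.0, 0.5]

# hidden-layer weights: one row of weights per hidden unit
W_hidden = [[0.4, -0.2],
            [0.1,  0.7]]
# output-layer weights: one weight per hidden unit
W_output = [0.8, -0.3]

hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in W_hidden]
score = sigmoid(sum(w * h for w, h in zip(W_output, hidden)))  # value in (0, 1)
```

During training, each number in `W_hidden` and `W_output` would be nudged by gradient descent; watching which weights grow large is how you would see which input features the model treats as most important.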
3. Split-ratio between training and test-data
One of the fundamental first steps in every machine learning process is to split the data into two subsets: training data and test data. In scikit-learn this looks like: X_train, X_test, Y_train, Y_test = train_test_split(X, Y, random_state=0, test_size=0.30)
The training data is what the model learns from. The test data is what we use to evaluate the model’s performance.
A commonly used split ratio between training and test data is 70/30. As you can imagine, this ratio has a real impact on our model’s measured performance. But in Google’s Smart Bidding, these values remain a black box for us.
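Under the hood the split is simple: shuffle, then slice. Here is a hand-rolled stdlib sketch of the same 70/30 idea that `train_test_split` implements (the function name `split` and the toy data are my own):

```python
import random

# A minimal 70/30 train/test split: shuffle reproducibly, then slice.

def split(rows, test_size=0.30, seed=0):
    rows = rows[:]                     # copy so the caller's list is untouched
    random.Random(seed).shuffle(rows)  # fixed seed, like random_state=0
    cut = round(len(rows) * (1 - test_size))
    return rows[:cut], rows[cut:]

data = list(range(100))
train, test = split(data)
# len(train) == 70, len(test) == 30, and no row appears in both
```

Changing `test_size` is exactly the knob the article describes: a larger test set gives a more reliable performance estimate but leaves less data to learn from.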
4. The output Score
Every algorithm is dedicated to a specific task and to a certain output we’d like to see.
This could be the probability of a user’s conversion, the correlation between two or more features, or the similarity between users used to build audience clusters. At the end of each algorithm’s computation there is a number: a score that helps to classify and evaluate the results.
For example: a user could be classified as a “Luxury Shopper” because, based on the logistic regression algorithm’s results, there is a 60% probability that she fits into this audience cluster. That rather low output score of 0.6 is something you’ll never see in the Google Ads dashboard.
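A hedged sketch of how such a score arises: logistic regression combines weighted features into one number and squashes it through a sigmoid into a probability. The feature names, weights, and the 0.5 decision threshold below are all invented for illustration:

```python
import math

# Hypothetical logistic regression score for a "Luxury Shopper" cluster.
# Weights, bias, and features are made up; real models learn them from data.

def luxury_shopper_probability(avg_basket_value, visits_per_month):
    w = [0.02, 0.15]  # invented learned weights
    b = -3.0          # invented bias term
    z = w[0] * avg_basket_value + w[1] * visits_per_month + b
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid maps z into (0, 1)

p = luxury_shopper_probability(avg_basket_value=150, visits_per_month=2)
label = "Luxury Shopper" if p >= 0.5 else "Other"
```

With these toy numbers the probability lands around 0.57, so the user is labeled a “Luxury Shopper” even though the model is barely more than half sure, which is precisely the nuance the dashboard hides.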
5. Overfitting, Underfitting and Regularization-Rate
Data is the gasoline of every machine learning model. But what if we run out of gas? What if we have too little data for our model to learn from properly?
This phenomenon is called “underfitting”: the algorithm cannot find a good fit and performs poorly even on the training data. The danger of underfitting is one reason why Google defined a minimum of 50 conversions within the last 30 days as a requirement.
So we can assume the more data we have the better, right? Kind of. But on the other side of the spectrum, overfitting is already waiting for us. If we include too many features, our model gets distracted and fails to generalize to new cases. The accuracy on the original data might be good, but the model will fail when predicting values for unknown data, e.g. future conversions.
There are two common ways to avoid overfitting.
- You can reduce the number of features: as soon as you realize that a feature (e.g. the age of the users) does not contribute much to the output, you can get rid of it, focus on the remaining features, and provide clearer signals to the algorithm.
- You can reduce the magnitude of your parameters by using regularization: an extra penalty term, scaled by a factor (lambda), is added to the cost function and shrinks the parameter values. You then tune this regularization rate, which determines how strongly the parameters shrink.
Both approaches are common and widely used, but how Google does it is hidden from us.
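The second option can be sketched in a few lines. This is a toy L2 (“ridge”) penalty bolted onto a one-parameter gradient descent; the data and the lambda value are invented, and this is only an illustration of the technique, not Google’s method:

```python
# L2 regularization added to a toy gradient-descent update.
# The penalty lam * w**2 in the cost adds 2 * lam * w to the gradient,
# which pulls the weight back toward zero.

def fit_ridge(lam, learning_rate=0.1, steps=200):
    data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # true slope is 2
    w = 0.0
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        grad += 2 * lam * w  # the extra regularization term
        w -= learning_rate * grad
    return w

w_plain = fit_ridge(lam=0.0)  # converges near 2.0
w_reg = fit_ridge(lam=1.0)    # noticeably smaller magnitude
```

Raising lambda shrinks the fitted weight further below 2.0: the model trades a bit of fit on the training data for smaller, more cautious parameters, which is exactly how regularization fights overfitting.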
I hope this article was useful for you and that you can apply this knowledge in your future work.