I have mixed feelings about cliffhangers, and I hope you do not hate me for leaving you with one in the last post. In this episode, we will look at the last piece of the puzzle for predictions using Cloud Machine Learning Engine REST APIs.
As promised, we will look at some example TensorFlow code. The following is the base TensorFlow code that our Google Cloud Machine Learning (ML) Engine model will be built upon:
If you think machine learning is a panacea for every business challenge and sell it as such, you’re doing it wrong. The best way to jeopardize your business is to go all in with machine learning by following the 5 tips below.
When dealing with sequences, the Viterbi algorithm and Viterbi decoding pop up regularly. The algorithm is usually described in the context of Hidden Markov Models (HMMs), but its application is not limited to them. Besides, HMMs have lately fallen out of fashion as better machine learning techniques have been developed.
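To make the "not limited to HMMs" point concrete, here is a minimal sketch of Viterbi decoding over generic additive scores. The function name `viterbi_decode` and the `emissions` / `transitions` layout are my own assumptions for illustration, not code from the post; the scores could be log-probabilities from an HMM or outputs of any other model.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring label sequence (hypothetical helper).

    emissions[t][j]: score of label j at position t (e.g. a log-probability)
    transitions[i][j]: score of moving from label i to label j
    Scores are summed along a path, so they need not come from an HMM.
    """
    if not emissions:
        return []
    n_labels = len(emissions[0])
    # best[j] is the score of the best path ending in label j at the current position
    best = list(emissions[0])
    backpointers = []
    for scores in emissions[1:]:
        new_best, pointers = [], []
        for j in range(n_labels):
            # Choose the previous label that maximizes the score of reaching j
            prev = max(range(n_labels), key=lambda i: best[i] + transitions[i][j])
            pointers.append(prev)
            new_best.append(best[prev] + transitions[prev][j] + scores[j])
        best = new_best
        backpointers.append(pointers)
    # Start from the best final label and follow the backpointers to the start
    path = [max(range(n_labels), key=lambda j: best[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path
```

With all-zero transition scores the decoder simply picks the best label at each position; adding a penalty for switching labels makes it prefer a consistent sequence, which is the whole point of decoding jointly rather than position by position.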
Sequence labeling is one of the classic ML tasks. It includes well-studied problems such as Part-of-Speech (POS) tagging, Named Entity Recognition (NER), address parsing, and more. Here I want to discuss two related topics: tokenization, and satisfying constraints imposed by the structure of the input document.
In the first of these posts, we covered the (now) conventional wisdom that having a bigger dataset is better for training machine learning algorithms. The second of the series detailed a few rules of thumb for creating quality datasets. This time around, we’ll look at how to start building datasets.
In the first of these posts, we covered the now conventional wisdom that having a bigger dataset is better for training machine learning algorithms. But size is not the only metric for success; quality is also critical.
There was a time when working with big data was not technically possible because our compute resources couldn’t handle the amount of information involved. Beyond that, it took a while for the use case to develop around massive computing resources, so it wasn’t even considered a worthy pursuit. Fifteen years ago, I remember creating machine learning algorithms using only a handful of data points and then tweaking feature representations for weeks. Back then, it was quite challenging to process the 20 Newsgroups dataset and its 19 thousand news items.
Even as recently as five years ago, the situation hadn’t improved much. At that time, I worked on putting a learning system with a continuous feedback loop into production. To fit the budget, we could only train the Random Forest with 5,000 examples – only a few days of data. Using such a small dataset alone would not have produced the desired results, so we had to implement many tricks to keep ‘some’ past data alongside the continuous feed of new data to keep everything running smoothly.
The first Innodata web service annotates cross-references to law and rule books within legal documents.
Exactly how much does deep learning cost? And are those prices fixed, or can they be optimized? Let me compare some cloud hardware and get down to dollars and cents to uncover some answers.
Deep Learning is cool. But it’s also expensive. So it makes sense to look at what options are available for training deep models in the cloud.