If you’ve ever tried to clone a repository from GitHub and gotten a “Permission denied (publickey)” error, you may need to create an SSH key and share the public key with GitHub. This post will walk through that process. The commands used in this post are for Mac and Linux.
This post is part of a series on machine learning with unbalanced datasets. It focuses on evaluation in particular. For background, please see the setup post.
This post is part of a series on machine learning with unbalanced datasets. It focuses on the makeup of the validation set in particular. For background, please see the setup post.
This post is part of a series on machine learning with unbalanced datasets. It focuses on training. For background, please see the setup post.
This post is the first in a series on working with unbalanced data. We’ll answer questions like how to train a model, how to validate it, and how to test it. Is it better that your datasets be balanced or representative of the real-world distribution?
As machine learning has continued to expand, so has the need for data. I’ve put together some of my favorite resources for finding datasets. I hope they are of some service.
I find it difficult to keep up with the latest in machine learning, even though it’s part of my full-time job. Fortunately, there are a lot of resources out there to help sort through it all. I thought I would put together a list of resources that I use in case it helps anyone else. If you know of a great resource that I’m missing, please let me know!