{"id":26601,"date":"2019-04-10T09:17:59","date_gmt":"2019-04-10T13:17:59","guid":{"rendered":"https:\/\/centricconsulting.com\/?p=26601"},"modified":"2023-10-11T08:50:59","modified_gmt":"2023-10-11T12:50:59","slug":"machine-learning-a-quick-introduction-and-five-core-steps","status":"publish","type":"post","link":"https:\/\/centricconsulting.com\/blog\/machine-learning-a-quick-introduction-and-five-core-steps\/","title":{"rendered":"Machine Learning: A Quick Introduction and Five Core Steps"},"content":{"rendered":"
In this blog, we talk about why everyone should get excited about artificial intelligence and machine learning<\/a>. Machine learning<\/a> (ML) continues to grow in its impact, providing exciting learning opportunities for technologists like us.<\/p>\n So what is ML exactly? I\u2019ll explain the basics below.<\/p>\n Before getting deep into ML, let\u2019s start with a basic definition.<\/p>\n We have seen many complex definitions, but the one I find most impactful is also one of the simplest: Machine learning \u201cgives computers the ability to learn without being explicitly programmed\u201d<\/strong> (Arthur Samuel, 1959).<\/p>\n ML started in the \u201950s and has risen and fallen in fashion over the years. However, ML is in its prime now thanks to the popularity of cloud technologies. The cloud enables ML to ingest and process enormous amounts of data, allowing it to be more powerful<\/strong>. Additionally, new cloud services make ML much more accessible than before.<\/p>\n The predictive features of ML make it highly useful in areas like fraud detection, customer service, energy production, healthcare, security, manufacturing, and many others.<\/p>\n There are two basic types of ML: unsupervised and supervised. The essential difference between supervised and unsupervised learning lies in the types of data they ingest and the algorithms they leverage. Unsupervised learning uses unlabeled data and \u201cself-guided\u201d learning algorithms<\/strong>. Supervised learning, on the other hand, uses labeled data and defined training algorithms<\/strong>.<\/p>\n The primary goals are also different. In supervised learning, predictive analytics is the main goal. In contrast, unsupervised learning focuses on finding data patterns<\/a>.<\/p>\n When we think about predicting outcomes with ML, we are typically referring to supervised learning.<\/p>\n Most AWS ML services are oriented toward supervised learning. 
Some of the most commonly used services are:<\/p>\n That said, services like Amazon EMR with the Spark Machine Learning Library are useful for unlabeled data and unsupervised learning.<\/p>\n There are five core tasks<\/a> in the common ML workflow:<\/p>\n The first step in the machine learning process is getting data.<\/p>\n This process depends on your project and data type. For example, are you planning to collect real-time data from an IoT system or static data from an existing database?<\/p>\n If you are looking for data to practice building a machine learning model<\/a> with, you can also use data from internet sites like Kaggle.<\/p>\n Real-world data often has unorganized, missing, or noisy elements. Therefore, for machine learning success<\/a>, after we choose our data, we need to clean, prepare, and manipulate it.<\/p>\n This process is a critical step, and people often spend up to 80 percent of their time in this stage. Having a clean data set helps<\/a> with your model\u2019s accuracy down the road.<\/strong><\/p>\n After getting the data to a state you like, you need to convert the data sets into valid formats for your chosen ML platform. For example, you may need to translate the data into a .CSV file and upload it to AWS S3.<\/p>\n Finally, you split your data into training and test data sets. The training set is used to train the model in the next step, while the test set is used to validate the model in the fourth step. A common default is a 70\/30 split between training and test sets.<\/p>\n This step is where the magic happens! The data set connects to an algorithm, and the algorithm leverages sophisticated mathematical modeling to learn and develop predictions.<\/p>\n These algorithms commonly fall into one of three categories:<\/p>\nComputers can learn!<\/h2>\n
Data and Learning<\/h2>\n
AWS ML Services<\/h2>\n
\n
Machine Learning Workflow<\/h2>\n
1. Get Data<\/h3>\n
2. Clean, Prepare & Manipulate Data<\/h3>\n
3. Train Model<\/h3>\n
\n
4. Test Model<\/h3>\n