Learning about Machine Learning (and Artificial Intelligence): part 1 of n

screenshot of realtime data visualization: https://word.threeceelabs.com

Background: I’m definitely no expert in machine learning (ML) or artificial intelligence (AI), and I have no formal education in either, but I’m fascinated and intrigued by the technologies. I’ve been learning about them and building apps with them on my own for the past few years.

In this series of posts, I’ll explain what I know, how I got to where I am, how I’ve used them in my projects, and chart a course for my future with them. Hopefully this process will push me to learn more and may help those who later head down this path. This post is a very quick introduction to the science and a brief overview of my current ML project.

Unless you live off the grid (and probably even if you do), machine learning (ML) and artificial intelligence (AI) touch your life. Even if you don’t directly use an ML or AI tool (like Siri or Alexa), this technology directly or indirectly permeates every nook and cranny of our modern world.

What are Machine Learning and Artificial Intelligence?

From Wikipedia:

Machine learning (ML) is the study of computer algorithms that improve automatically through experience.[1] It is seen as a subset of artificial intelligence. Machine learning algorithms build a model based on sample data, known as “training data”, in order to make predictions or decisions without being explicitly programmed to do so.[2] Machine learning algorithms are used in a wide variety of applications, such as email filtering and computer vision, where it is difficult or unfeasible to develop conventional algorithms to perform the needed tasks.

Artificial intelligence (AI), is intelligence demonstrated by machines, unlike the natural intelligence displayed by humans and animals. Leading AI textbooks define the field as the study of “intelligent agents”: any device that perceives its environment and takes actions that maximize its chance of successfully achieving its goals.[3] Colloquially, the term “artificial intelligence” is often used to describe machines (or computers) that mimic “cognitive” functions that humans associate with the human mind, such as “learning” and “problem solving”.[4]

Why did I begin learning about ML and AI?

My ML/AI journey began with events that birthed the #BlackLivesMatter movement in 2013. After another police killing of an unarmed black man and the resultant social reaction, a white, former high school classmate’s comment on Facebook prompted me to respond that I seemed to live in a different America than he did.

My feelings of disconnection prompted me to begin a personal art project to analyze and visualize social media activity surrounding #BlackLivesMatter as a way to try and make sense of how we (black and white people, generally) can view the same event and interpret it in nearly opposite ways. Initially, I created a website to visualize Twitter data (tweets that included the #BlackLivesMatter hashtag) in realtime. With the start of Donald Trump’s first presidential campaign in 2015, that project morphed into a project that looked at politics beyond #BlackLivesMatter, as it seemed to me that the forces at play were the same forces that were polarizing the country more broadly.

As part of this Twitter data visualization, I began to politically categorize Twitter users as “liberal”, “conservative” and “neutral.” It quickly became clear that manual categorization would be slow and tedious. I can categorize a user in about 5–10 seconds using their Twitter profile and 1 or 2 of their tweets. I thought there must be a way for me to use software to assist in categorizing Twitter users.

Initially, I coded a quick-and-dirty categorizer that searched for keywords in a user’s profile and tweets. It quickly became clear that this rules-based approach wasn’t very accurate and was difficult to modify and maintain, so I began to explore the possibility of using ML/AI for categorization.
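To give a feel for what a rules-based categorizer like this might look like, here is a minimal sketch. It is not the original code: the keyword lists, the function name `categorizeByKeywords` and the tie-breaking rule are all assumptions made up for illustration.

```javascript
// Hypothetical keyword lists, for illustration only.
const KEYWORDS = {
  liberal: ["#exampleleft", "progressive"],
  conservative: ["#exampleright", "traditional"],
};

// Count how many keywords from each list appear in a user's profile
// and tweets, then return the category with more hits.
// A tie (including zero hits on both sides) falls back to "neutral".
function categorizeByKeywords(profileText, tweets) {
  const text = [profileText, ...tweets].join(" ").toLowerCase();
  const counts = {};
  for (const [category, words] of Object.entries(KEYWORDS)) {
    counts[category] = words.filter((word) => text.includes(word)).length;
  }
  if (counts.liberal > counts.conservative) return "liberal";
  if (counts.conservative > counts.liberal) return "conservative";
  return "neutral";
}
```

Even in this toy version the maintenance problem shows: every new slogan or hashtag means editing the lists by hand, and a keyword used sarcastically still counts as a hit.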

Quick Neural Network (NN) Primer:

Much of modern ML and AI is based on artificial neural networks, software abstractions that roughly mimic the operation of biological neural networks. (Not all of it is: decision trees and other statistical methods are also ML.)

These neural networks generate an output based on inputs. The inputs to the NNs are typically an array, where each input is “normalized” to a value between 0 and 1, and may be either a binary or floating point value. The output is typically also an array of one or more signals, also typically “normalized” between 0 and 1.
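As a rough illustration of normalization, the sketch below squeezes a raw number into the 0 to 1 range with simple min-max scaling and mixes it with binary inputs. The helper name `normalize` and the example values (a follower count capped at 1000) are assumptions for illustration, not values from my app.

```javascript
// Min-max scaling: map a raw value from [min, max] onto [0, 1].
// Values outside the range are clamped so the result never leaves [0, 1].
function normalize(value, min, max) {
  if (max === min) return 0; // avoid dividing by zero on a degenerate range
  const scaled = (value - min) / (max - min);
  return Math.min(1, Math.max(0, scaled));
}

// A hypothetical input array mixing binary and floating point values.
const inputs = [
  1, // binary: some feature is present
  0, // binary: some other feature is absent
  normalize(250, 0, 1000), // float: 250 on a 0..1000 scale becomes 0.25
];
```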

NNs are constructed from layers of nodes. A typical NN has an input layer, optionally followed by one or more hidden layers, followed by one output layer.

A node is loosely analogous to a biological neuron: it has one or more inputs and one output. Each node generates an output value (typically a float) based on the values of its inputs, its weights, and some internal function. The weights are adjusted during training so that the desired output values are generated from particular input values.
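Here is a minimal sketch of a single node and of what "adjusting the weights during training" can mean. It assumes a sigmoid as the internal function and a simple gradient-style update (the delta rule); real libraries offer many other functions and training methods, and the names `activateNode` and `trainStep` are made up for this example.

```javascript
// Sigmoid squashes any number into the range (0, 1).
const sigmoid = (x) => 1 / (1 + Math.exp(-x));

// A node's output: weighted sum of its inputs plus a bias, passed through sigmoid.
function activateNode(node, inputs) {
  const sum = inputs.reduce((acc, x, i) => acc + x * node.weights[i], node.bias);
  return sigmoid(sum);
}

// One training step: nudge each weight in the direction that moves
// the node's output closer to the target value.
function trainStep(node, inputs, target, learningRate) {
  const output = activateNode(node, inputs);
  const delta = (target - output) * output * (1 - output);
  node.weights = node.weights.map((w, i) => w + learningRate * delta * inputs[i]);
  node.bias += learningRate * delta;
}

// Toy training set: teach one node the logical OR of two binary inputs.
const examples = [
  { inputs: [0, 0], target: 0 },
  { inputs: [0, 1], target: 1 },
  { inputs: [1, 0], target: 1 },
  { inputs: [1, 1], target: 1 },
];

const node = { weights: [0, 0], bias: 0 };
for (let epoch = 0; epoch < 5000; epoch++) {
  for (const example of examples) {
    trainStep(node, example.inputs, example.target, 0.5);
  }
}
```

After the loop, `activateNode(node, [0, 0])` is below 0.5 and the other three input pairs give outputs above 0.5, so the node has "learned" OR from examples rather than from a hand-written rule.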

In real-world applications, an NN can have an input layer with thousands of nodes, several hidden layers with hundreds of nodes each, and several output nodes. Training these large NNs can require millions or billions of examples (input/output pairs) and vast compute resources.
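To get a sense of scale, the sketch below counts the adjustable values (weights and biases) in a network where every node in one layer connects to every node in the next. Full connectivity and the layer sizes used here are assumptions for illustration only.

```javascript
// Count weights and biases in a fully connected network.
// layerSizes lists the node count of each layer, input layer first.
function countParameters(layerSizes) {
  let total = 0;
  for (let i = 1; i < layerSizes.length; i++) {
    const weights = layerSizes[i - 1] * layerSizes[i]; // one weight per connection
    const biases = layerSizes[i]; // one bias per node in this layer
    total += weights + biases;
  }
  return total;
}

// Hypothetical network: 1000 inputs, two hidden layers of 100 nodes, 3 outputs.
const parameterCount = countParameters([1000, 100, 100, 3]); // 110503
```

Even this modest example has over a hundred thousand values to tune, which is part of why training needs so many examples.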

My NN App Overview:

You can view the current live visualization here (note: BETA software): https://word.threeceelabs.com/session.

As I was most proficient in JavaScript, I searched for an open source JS library for neural networks and found Neataptic, and later Carrot and Brain.js. These libraries support the creation and training and/or evolution of neural networks.

Through many hours of trial and error, many generations of NNs, and many CPU cycles of NN evolution and training, the current best NN successfully categorizes about 85% of Twitter users, which means that the current best NN and I agree about 85% of the time. The current best NN was created using Neataptic.
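The "agreement" figure is simply the fraction of users for which the NN’s category matches my manual category. A minimal sketch of that calculation, with a made-up function name and made-up sample labels:

```javascript
// Fraction of users where the predicted category equals the manual one.
function agreementRate(manualLabels, predictedLabels) {
  if (manualLabels.length !== predictedLabels.length) {
    throw new Error("label arrays must be the same length");
  }
  if (manualLabels.length === 0) return 0;
  const matches = manualLabels.filter(
    (label, i) => label === predictedLabels[i]
  ).length;
  return matches / manualLabels.length;
}

// Hypothetical sample: the NN agrees with the manual label on 3 of 4 users.
const manual = ["liberal", "conservative", "neutral", "liberal"];
const predicted = ["liberal", "conservative", "liberal", "liberal"];
const rate = agreementRate(manual, predicted); // 0.75
```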

Some stats:

  • I have manually categorized over 65,000 Twitter users
  • number of NN inputs: between 1,200 and 1,500
  • NN input types: #hashtags, @usernames, URLs, Twitter “friends” IDs, emoji, Twitter Media IDs, words, n-grams, locations, Twitter Place IDs, language sentiment analysis
  • realtime tweets analyzed: about 5,800/min
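One common way to turn input types like these into the array of numbers an NN expects is to give each known hashtag, username, word, and so on its own position in the array, and set that position to 1 when the item appears for a user. The sketch below shows that idea only; the vocabulary entries and the `encodeUser` helper are hypothetical, and it is not a description of exactly how my app builds its inputs.

```javascript
// Hypothetical vocabulary: each entry owns one position in the input array.
const VOCABULARY = ["#examplehashtag", "@exampleuser", "example.com", "🙂"];

// Build a binary input array: 1 if the user's features include the
// vocabulary entry at that position, 0 otherwise. Unknown features are ignored.
function encodeUser(features, vocabulary) {
  const present = new Set(features.map((feature) => feature.toLowerCase()));
  return vocabulary.map((term) => (present.has(term) ? 1 : 0));
}

// A user who used one known hashtag, one known emoji, and one unknown word.
const encoded = encodeUser(["#ExampleHashtag", "🙂", "unknownword"], VOCABULARY);
// encoded is [1, 0, 0, 1]
```

With a vocabulary of 1,200 to 1,500 entries, the same approach yields an input array of that length.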

In future articles in this series, I’ll describe my app in more detail, including its development, current implementation, and future plans.
