Using ml5.js, poseNet, and p5.js to Turn Myself Into a Cat

Denisa Marcisovska
May 26, 2020

A live example of the code: https://dmarcisovska.github.io/ml5-posenet-cat/

GitHub code: https://github.com/dmarcisovska/ml5-posenet-cat

I had a lot of fun creating my last ml5.js app, which classifies a photo using MobileNet. For this project, I wanted to dive deeper into ml5.js. After exploring the ml5.js website, I thought it would be fun to use the pre-trained poseNet model to draw features on my face. PoseNet is a machine learning model that performs real-time human pose estimation by locating key points on your body and face.

The ml5.js poseNet cat app I created turns the viewer's face into a cat. I used poseNet to find the eyes and p5.js to draw cat eyes at their location. I did the same for the whiskers, nose, and ears. I added some extra math so the cat parts resize depending on how far the user is from the camera. I then added buttons linked to audio files so the user can play a meowing, hissing, or purring noise if they choose.

HTML

To create the HTML framework, I added the external script and link tags needed to load p5.js, ml5.js (which provides poseNet), Bootstrap, and Google Fonts. I added three buttons that the user can press to play cat noises. Each button calls a function that plays an audio file. I also added three audio elements to the page; they have no controls attribute, so they are not visible on the website.

<!DOCTYPE html>
<html>
<head>
<script src="https://cdnjs.cloudflare.com/ajax/libs/p5.js/0.10.2/p5.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/p5.js/0.10.2/addons/p5.sound.min.js"></script>
<script src="https://unpkg.com/ml5@0.4.3/dist/ml5.min.js"></script>
<link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.4.1/css/bootstrap.min.css" integrity="sha384-Vkoo8x4CGsO3+Hhxv8T/Q5PaXtkKtu6ug5TOeNV6gBiFeWPGFN9MuhOf23Q9Ifjh" crossorigin="anonymous">
<link href='https://fonts.googleapis.com/css?family=Lato:300' rel='stylesheet' type='text/css'>
<link rel="stylesheet" type="text/css" href="styles.css">
<meta charset="utf-8" />
</head>
<body>
<script src="sketch.js"></script>
<a href="https://www.vecteezy.com/free-vector/seamless-pattern"> Wallpaper by Vecteezy</a>
<div class="container" id="button-group">
<button type="button" class="btn btn-info btn-lg mr-3" onclick="meow()">Meow</button>
<button type="button" class="btn btn-info btn-lg mr-3" onclick="purr()">Purr</button>
<button type="button" class="btn btn-info btn-lg" onclick="hiss()">Hiss</button>
</div>
<audio src="assets/meow.mp3" id="meow">
<p class="text-center">If you are reading this, it is because your browser does not support the audio element. </p>
</audio>
<audio src="assets/purr.mp3" id="purr">
<p class="text-center">If you are reading this, it is because your browser does not support the audio element. </p>
</audio>
<audio src="assets/hiss.wav" id="hiss">
<p class="text-center">If you are reading this, it is because your browser does not support the audio element. </p>
</audio>
</body>
</html>

CSS

I added a background image of cats. I styled the canvas to look like a picture frame by giving it padding, a white background, and a drop shadow. I rotated the webcam canvas slightly for a cool effect. I also pinned the buttons to the bottom of the page.

html, body {
margin: 0;
padding: 0;
font-family: 'Lato', sans-serif;
}
body{
background: url("assets/cats.jpg") no-repeat fixed 0 0 / cover;
}
canvas {
display: block;
}
canvas {
margin-top: 50px;
margin-left: auto;
margin-right: auto;
}
canvas {
padding: 10px 10px 10px 10px;
background: white;
border: 1px solid #fff;
box-shadow: 0px 2px 15px #333;
-webkit-box-shadow: 0px 2px 15px #333;
-moz-box-shadow: 0px 2px 15px #333;
-webkit-transform: rotate(-5deg);
-moz-transform: rotate(-5deg);
transform: rotate(-5deg);
font-family: 'Permanent Marker', cursive;
color: black;
font-size: 32px;
margin: 50px auto 50px auto;
}
#button-group {
position: absolute;
bottom: 30px;
margin-left: auto;
margin-right: auto;
}
a {
color: black !important;
font-size: 12px;
}

JavaScript

I created three functions for playing the noises on the website: meow, purr, and hiss. Whenever a user clicks a button, it plays the corresponding noise. In the setup function, I create the canvas and set its size, start the webcam capture, and load poseNet.
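
Because the three sound functions differ only in the element id, they could be collapsed into one helper. The sketch below is an alternative, not the code used in this project. The name playSound is hypothetical, and the doc parameter exists only so the helper can be exercised outside a browser with a stand-in object.

```javascript
// Hypothetical helper: play the <audio> element with the given id.
// doc defaults to the browser's document; a stub can be passed for testing.
function playSound(id, doc = globalThis.document) {
  const audio = doc.getElementById(id);
  if (!audio) return false; // no element with that id
  audio.play();
  return true;
}

// Stand-in for the page's three audio elements (not a real DOM).
const played = [];
const fakeDoc = {
  getElementById: (id) =>
    ["meow", "purr", "hiss"].includes(id)
      ? { play: () => played.push(id) }
      : null,
};

playSound("purr", fakeDoc);
console.log(played); // [ 'purr' ]
```

With a helper like this, each button could call playSound('meow'), playSound('purr'), or playSound('hiss') from its onclick attribute.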

In the gotPoses function, I check whether any poses were detected. If there is at least one, I assign the first one to the pose variable. The pose object contains x, y coordinates (just like the x, y graphs we had to make back in middle/high school) for specific locations on the face and body. For example, my left eye is located at pose.leftEye.x, pose.leftEye.y, and my nose is located at pose.nose.x, pose.nose.y.
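
As a rough illustration of the data gotPoses receives, the sketch below runs the same check against a hand-written poses array. The coordinates and confidence values are invented, and firstPose is a hypothetical helper name. The real array comes from ml5.js and carries more fields than shown here.

```javascript
// Mock of what poseNet passes to the 'pose' callback (values are invented).
const mockPoses = [
  {
    pose: {
      nose: { x: 320, y: 240, confidence: 0.99 },
      leftEye: { x: 350, y: 210, confidence: 0.98 },
      rightEye: { x: 290, y: 210, confidence: 0.98 },
    },
  },
];

// Hypothetical helper: return the first detected pose, or undefined if none.
function firstPose(poses) {
  return poses.length > 0 ? poses[0].pose : undefined;
}

const samplePose = firstPose(mockPoses);
console.log(samplePose.leftEye.x, samplePose.leftEye.y); // 350 210
console.log(firstPose([])); // undefined
```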

In the draw function, I calculate the distance between the eyes and store it in a variable named d. I use d to resize the shapes depending on how far I am from the camera.
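
The p5.js dist function returns the straight-line distance between two points. A plain JavaScript equivalent, using a hypothetical eyeDistance helper and invented coordinates, suggests why the eye distance works as a scale factor: when the face moves closer to the camera, the eyes sit further apart in the image, so the value grows.

```javascript
// Plain-JS equivalent of p5's dist(x1, y1, x2, y2) for two keypoints.
function eyeDistance(eyeR, eyeL) {
  return Math.hypot(eyeL.x - eyeR.x, eyeL.y - eyeR.y);
}

// Invented coordinates: the same face far from, then close to, the camera.
const far = eyeDistance({ x: 300, y: 200 }, { x: 330, y: 200 });
const near = eyeDistance({ x: 260, y: 200 }, { x: 380, y: 200 });

console.log(far);  // 30
console.log(near); // 120
```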

I used p5.js to create all the shapes. The outer eye was created using the p5.js ellipse function: ellipse(x, y, w, h). The first two numbers are the x and y coordinates of the center, and the last two are the width and the height. The width and height are fractions of the distance between the eyes (the variable d), so the eyes resize depending on how close the user is to the camera. To pick these fractions, I played around with different numbers until I liked the result. I created the inner eye the same way, using ellipse again with a narrower width and a different color. The inner eye's width and height are also fractions of d, so they resize with the user's distance from the camera.
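
To make the eye proportions concrete, here is a small sketch that computes the ellipse sizes from the eye distance, using the same divisors as the draw function in this post. The eyeSizes helper name and the sample distance of 60 are illustrative only.

```javascript
// Compute outer and inner eye dimensions as fractions of the eye distance d.
function eyeSizes(d) {
  return {
    outer: { w: d / 1.5, h: d / 2.5 }, // wide black ellipse
    inner: { w: d / 10, h: d / 4 },    // narrow green pupil
  };
}

const sizes = eyeSizes(60);
console.log(sizes.outer); // { w: 40, h: 24 }
console.log(sizes.inner); // { w: 6, h: 15 }
```

In a p5.js sketch, these values would be passed as the last two arguments of ellipse, for example ellipse(eye.x, eye.y, sizes.outer.w, sizes.outer.h).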

To draw the nose I used the p5.js triangle function, which takes the coordinates of the triangle's three corners: triangle(x1, y1, x2, y2, x3, y3). I played around with different offsets based on d until the nose was in the right position. Because the offsets are fractions of d, the nose resizes correctly.
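
The nose coordinates can be worked through by hand. The sketch below uses a hypothetical noseTriangle helper and an invented nose position to compute the six triangle arguments with the same offsets as the draw function in this post. On the p5.js canvas, y grows downward, so subtracting from y moves a corner up.

```javascript
// Return the six triangle() arguments for the nose, given the nose keypoint and d.
function noseTriangle(nose, d) {
  return [
    nose.x - d / 3, nose.y - d / 10, // top-left corner
    nose.x + d / 3, nose.y - d / 10, // top-right corner
    nose.x, nose.y + d / 3,          // bottom tip
  ];
}

console.log(noseTriangle({ x: 100, y: 100 }, 30));
// [ 90, 97, 110, 97, 100, 110 ]
```

In a sketch, the result could be spread straight into the call: triangle(...noseTriangle(pose.nose, d)).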

I used a similar method for the outer and inner ears, with the same triangle function as above. I played around with different coordinates until the ears were in the correct position. I used the nose as the reference point for the ear coordinates, but could have used the eyes instead. I made the offsets multiples of d so the ears resize. I made the outer ear larger than the inner ear. The outer ear is black and the inner ear is a salmon color.
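
The two ears are mirror images of each other around the nose, so one way to express them is a single function with a sign parameter. This is a sketch of that idea rather than the structure used in this project. The outerEar helper and its side parameter are hypothetical, and the offsets are the outer-ear values from the draw function in this post.

```javascript
// Return triangle() arguments for one outer ear.
// side is +1 for the ear drawn to the right of the nose, -1 for the left.
function outerEar(nose, d, side) {
  return [
    nose.x + side * d, nose.y - d * 1.2,       // outer base corner
    nose.x + side * (d / 1.2), nose.y - d * 2, // tip
    nose.x + side * (d / 5), nose.y - d * 1.5, // inner base corner
  ];
}

const earNose = { x: 100, y: 100 };
const rightEar = outerEar(earNose, 10, 1);
const leftEar = outerEar(earNose, 10, -1);

// Each x offset from the nose is mirrored; the y values are identical.
console.log(rightEar[0] - earNose.x, leftEar[0] - earNose.x); // 10 -10
console.log(rightEar[1] === leftEar[1]); // true
```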

To draw the whiskers I used the line function from p5.js: line(x1, y1, x2, y2). I specified the two points I wanted each line to connect, and made these points offsets of d from the nose, as with the other shapes.
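
The six whisker lines follow one pattern: each starts a short distance from the nose and fans out to a full eye distance away. The sketch below computes the endpoints with a hypothetical whiskers helper, using the offsets from the draw function in this post and an invented nose position.

```javascript
// Return [x1, y1, x2, y2] for the three whiskers on one side of the nose.
// side is +1 for the right side of the image, -1 for the left.
function whiskers(nose, d, side) {
  const startX = nose.x + side * (d / 2.5);
  const endX = nose.x + side * d;
  return [
    [startX, nose.y - d / 20, endX, nose.y - d / 4], // angled up
    [startX, nose.y, endX, nose.y],                  // straight out
    [startX, nose.y + d / 20, endX, nose.y + d / 4], // angled down
  ];
}

const whiskerLines = [
  ...whiskers({ x: 100, y: 100 }, 40, 1),
  ...whiskers({ x: 100, y: 100 }, 40, -1),
];
console.log(whiskerLines.length); // 6
console.log(whiskerLines[0]); // [ 116, 98, 140, 90 ]
```

In a sketch, each entry could be drawn with whiskerLines.forEach((w) => line(...w)).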

let video;
let poseNet;
let pose;
function meow(){
const meow = document.getElementById('meow');
meow.play();
}
function purr(){
const purr = document.getElementById('purr');
purr.play();
}
function hiss(){
const hiss = document.getElementById('hiss');
hiss.play();
}
function setup() {
createCanvas(640, 480);
video = createCapture(VIDEO);
video.hide();
poseNet = ml5.poseNet(video, modelLoaded);
poseNet.on('pose', gotPoses);
}
function gotPoses(poses) {
if (poses.length > 0) {
pose = poses[0].pose;
}
}
function modelLoaded() {
console.log('poseNet ready');
}
function draw() {
image(video, 0, 0);
if (pose) {
let eyeR = pose.rightEye;
let eyeL = pose.leftEye;
let d = dist(eyeR.x, eyeR.y, eyeL.x, eyeL.y);
// Outer Eye
fill(0);
ellipse(eyeR.x, eyeR.y, d/1.5, d/2.5);
ellipse(eyeL.x, eyeL.y, d/1.5, d/2.5);
// Inner Eye
fill('#00ff80');
ellipse(eyeR.x - 1, eyeR.y, d/10, d/4);
ellipse(eyeL.x - 1, eyeL.y, d/10, d/4);
// Nose
fill(0);
triangle(pose.nose.x - (d/3), pose.nose.y - (d/10), pose.nose.x + (d/3), pose.nose.y -(d/10), pose.nose.x, pose.nose.y + (d/3));
//Outer Ear
triangle(pose.nose.x + d, pose.nose.y - (d*1.2), pose.nose.x + (d/1.2), pose.nose.y -(d*2), pose.nose.x + (d/5), pose.nose.y - (d*1.5));
triangle(pose.nose.x - d, pose.nose.y - (d*1.2), pose.nose.x - (d/1.2), pose.nose.y -(d*2), pose.nose.x - (d/5), pose.nose.y - (d*1.5));
// Inner Ear
fill('#FA8072');
triangle(pose.nose.x + (d*.9), pose.nose.y - (d*1.3), pose.nose.x + (d/1.3), pose.nose.y - (d*1.8), pose.nose.x + (d/3.33), pose.nose.y - (d*1.5));
triangle(pose.nose.x - (d*.9), pose.nose.y - (d*1.3), pose.nose.x - (d/1.3), pose.nose.y - (d*1.8), pose.nose.x - (d/3.33), pose.nose.y - (d*1.5));
// Whiskers
line(pose.nose.x + (d/2.5), pose.nose.y - (d/20), pose.nose.x + (d), pose.nose.y - (d/4));
line(pose.nose.x + (d/2.5), pose.nose.y + (d/20), pose.nose.x + d, pose.nose.y + (d/4));
line(pose.nose.x + (d/2.5), pose.nose.y, pose.nose.x + d, pose.nose.y);
line(pose.nose.x - (d/2.5), pose.nose.y - (d/20), pose.nose.x - d, pose.nose.y - (d/4));
line(pose.nose.x - (d/2.5), pose.nose.y + (d/20), pose.nose.x - d, pose.nose.y + (d/4));
line(pose.nose.x - (d/2.5), pose.nose.y, pose.nose.x - d, pose.nose.y);
}
}

Voilà! This was a really fun project, and it demonstrates some of what poseNet can do.
