NSFW.js: Machine Learning Applied to Indecent Content Detection

With the beta-released NSFW.js, developers can now include a client-side filter for indecent content in their applications. NSFW.js classifies images into one of five categories: Drawing, Hentai, Neutral, Porn, Sexy. On some benchmarks, NSFW.js categorizes images with a 90% accuracy rate.

NSFW.js is an npm module which, given an image element or canvas, computes the probability that the image belongs to each of five predetermined categories:

[Image: NSFW image classification example]

A typical usage on the client is as follows:

// Classic import style

import * as nsfwjs from 'nsfwjs'
// or just use require('nsfwjs')

// Load files from the server to the client!
const model = await nsfwjs.load('/model/')

// Gimme that image
const img = document.getElementById('questionable_img')

// Classify the image
const predictions = await model.classify(img)

// Share results
console.log('Predictions: ', predictions)
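// `predictions` is an array of { className, probability } objects, one entry per category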

The model that NSFW.js leverages for its predictions is loaded onto the client. The classify function accepts an image as a parameter and returns the computed predictions. The model is currently around 20 MB, which may be an issue for users of mobile applications relying on NSFW.js. Gant Laborde, the library creator, comments:

It gets cached. I agree, though. Models need to get smaller.

NSFW.js may fit into the implementation of a content-moderation policy in a number of ways. It may for instance be used as a pre-filter, sending to the moderation team only the content whose probability of belonging to a moderated category passes a given threshold, as sketched below.
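The following sketch illustrates what such a pre-filter could look like on top of the classify output. The threshold value, the set of moderated categories, and the sendToModerationQueue helper are assumptions made for the example; they are not part of the NSFW.js API.

// The moderated categories and the probability threshold are application
// choices, not something NSFW.js prescribes
const MODERATED = new Set(['Hentai', 'Porn', 'Sexy'])
const THRESHOLD = 0.7

async function preFilter(img, model) {
  const predictions = await model.classify(img)

  // Flag the image if any moderated category exceeds the threshold
  const flagged = predictions.some(
    (p) => MODERATED.has(p.className) && p.probability >= THRESHOLD
  )

  if (flagged) {
    // Hypothetical helper: hand the image off for human review
    await sendToModerationQueue(img.src, predictions)
  }
  return flagged
}

Only flagged items would then reach the human moderation layer, keeping the review volume down.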

NSFW.js uses machine-learning algorithms to build the model from which it performs its classification. Laborde explains:

it was trained using lots of images from Reddit depending on the sub. With about 300k images certain patterns emerged and those are what the model detects.

NSFW.js touts a 90% success rate on its test set. However, it still fails to correctly classify some images that a human would instantly and accurately assess. Laborde additionally mentions that the model may show bias:

The fun one was that Jeffrey Goldblum kept getting miscategorized, and the not-so-fun one was that the model was overly sensitive to females.

The false positives may, however, not be a significant issue for content-filtering applications, provided there is a secondary human layer. With the continuous inclusion of more false positives into the training data set, and further fine-tuning of the model, the accuracy of NSFW.js is poised to keep increasing.

The five categories are defined as:

  • Drawing - safe for work drawings (including anime)
  • Hentai - hentai and pornographic drawings
  • Neutral - safe for work neutral images
  • Porn - pornographic images, sexual acts
  • Sexy - sexually explicit images, not pornography

NSFW.js is available under the MIT open-source license. Contributions are welcome via the project's GitHub repository.
