The internet is currently losing its mind over a 39-second black-and-white clip of Elon Musk and Nikhil Kamath laughing in complete silence. Is it a real podcast teaser? Or is it a high-end AI generation from Grok 3 or Sora?
While Twitter argues, developers can solve this with code.
One of the most common failures in AI-generated video is physiological inconsistency. AI models are great at rendering textures, but they often forget “biological rules”, specifically blinking. Humans blink spontaneously every 2–10 seconds. AI avatars often stare unblinkingly for unnaturally long periods or blink with irregular, “morphing” eyelids.
In this tutorial, I’ll show you how to build a Python script that analyzes that viral video frame-by-frame to count blinks. If Elon doesn’t blink for 39 seconds, we have our answer.
The Logic: The Eye Aspect Ratio (EAR)
We don’t need to train a massive neural network. We can use a simple geometric metric called the Eye Aspect Ratio (EAR).
We map 6 facial landmarks around each eye: one at each corner, and two pairs along the upper and lower lids. The EAR is the ratio of the vertical distances between the lid landmarks to the horizontal distance between the corners.
- When the eye is open: The vertical distance is large, so the EAR is high (roughly 0.30).
- When the eye closes: The vertical distance drops to near zero, so the EAR plummets.
If the graph of EAR over time stays flat, you are likely looking at a Deepfake.
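To make the geometry concrete, here is a minimal sketch of the EAR calculation. The six landmark coordinates are made-up illustrative points (corners at the ends, lid points on top and bottom), not real detector output:

```python
from math import dist

def eye_aspect_ratio(eye):
    # Vertical distances between the two pairs of upper/lower lid points
    A = dist(eye[1], eye[5])
    B = dist(eye[2], eye[4])
    # Horizontal distance between the eye corners
    C = dist(eye[0], eye[3])
    return (A + B) / (2.0 * C)

# Hypothetical (x, y) landmarks in dlib's ordering: corner, lid, lid, corner, lid, lid
open_eye   = [(0, 0), (3, -2), (7, -2), (10, 0), (7, 2), (3, 2)]  # lids apart
closed_eye = [(0, 0), (3, 0),  (7, 0),  (10, 0), (7, 0), (3, 0)]  # lids touching

print(f"open EAR:   {eye_aspect_ratio(open_eye):.2f}")    # → 0.40
print(f"closed EAR: {eye_aspect_ratio(closed_eye):.2f}")  # → 0.00
```

An open eye sits well above the 0.25 threshold used later; a closed one collapses toward zero, which is exactly the dip we will count as a blink.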
Step 1: The Setup
You will need a few standard computer vision libraries. Open your terminal (or Google Colab) and run:
```bash
pip install opencv-python dlib imutils scipy
```
Note: dlib can be tricky to install on Windows. If you get errors, you may need to install CMake first, or just run this in a Google Colab notebook where it works out of the box.
You will also need the pre-trained face landmark file. Download `shape_predictor_68_face_landmarks.dat` (it’s widely available on GitHub/HuggingFace).
Step 2: The Deepfake Detector Script
Create a file named detector.py and paste the following code. I have optimized this to work specifically on video files like the downloaded Twitter clip.
```python
import cv2
import dlib
import numpy as np
from scipy.spatial import distance as dist
from imutils import face_utils

# --- CONFIGURATION ---
# EAR threshold: below this, we count it as a "closed eye"
EYE_AR_THRESH = 0.25
# Consecutive frames: how long the eye must stay closed to count as a blink
EYE_AR_CONSEC_FRAMES = 3

def eye_aspect_ratio(eye):
    # Calculate vertical distances
    A = dist.euclidean(eye[1], eye[5])
    B = dist.euclidean(eye[2], eye[4])
    # Calculate horizontal distance
    C = dist.euclidean(eye[0], eye[3])
    # Compute ratio
    return (A + B) / (2.0 * C)

# Load face detector and landmark predictor
print("[INFO] Loading facial landmark predictor...")
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

# Get array index ranges for the left and right eyes
(lStart, lEnd) = face_utils.FACIAL_LANDMARKS_IDXS["left_eye"]
(rStart, rEnd) = face_utils.FACIAL_LANDMARKS_IDXS["right_eye"]

# Load the viral video
cap = cv2.VideoCapture("musk_kamath_clip.mp4")

blink_count = 0
counter = 0

while True:
    ret, frame = cap.read()
    if not ret:
        break

    # Resize for faster processing
    frame = cv2.resize(frame, (800, 600))
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Detect faces
    rects = detector(gray, 0)

    for rect in rects:
        shape = predictor(gray, rect)
        shape = face_utils.shape_to_np(shape)

        # Extract eye coordinates
        leftEye = shape[lStart:lEnd]
        rightEye = shape[rStart:rEnd]

        # Calculate EAR for both eyes and average them
        leftEAR = eye_aspect_ratio(leftEye)
        rightEAR = eye_aspect_ratio(rightEye)
        ear = (leftEAR + rightEAR) / 2.0

        # VISUALIZATION: draw contours around the eyes
        leftEyeHull = cv2.convexHull(leftEye)
        rightEyeHull = cv2.convexHull(rightEye)
        cv2.drawContours(frame, [leftEyeHull], -1, (0, 255, 0), 1)
        cv2.drawContours(frame, [rightEyeHull], -1, (0, 255, 0), 1)

        # LOGIC: check for a blink
        if ear < EYE_AR_THRESH:
            counter += 1
        else:
            if counter >= EYE_AR_CONSEC_FRAMES:
                blink_count += 1
                # Visual alert
                cv2.putText(frame, "BLINK DETECTED", (10, 30),
                            cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)
            counter = 0

        # Display stats
        cv2.putText(frame, f"Blinks: {blink_count}", (10, 450),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 0), 2)
        cv2.putText(frame, f"EAR: {ear:.2f}", (300, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.7, (255, 255, 255), 2)

    cv2.imshow("Deepfake Detector", frame)

    # Press 'q' to exit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
```
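The counter/threshold branch is the part that is easiest to get subtly wrong, so it is worth testing in isolation. The sketch below is a hypothetical helper (not part of `detector.py`) that replays a synthetic EAR series through the same rules, plus handling for a closure that runs to the end of the series, which the main loop does not need because the clip ends with open eyes anyway:

```python
EYE_AR_THRESH = 0.25
EYE_AR_CONSEC_FRAMES = 3

def count_blinks(ear_series, thresh=EYE_AR_THRESH, consec=EYE_AR_CONSEC_FRAMES):
    """Count blinks: a blink is >= `consec` consecutive frames below `thresh`."""
    blinks = 0
    counter = 0
    for ear in ear_series:
        if ear < thresh:
            counter += 1
        else:
            if counter >= consec:
                blinks += 1
            counter = 0
    # Count a closure that lasts until the final frame
    if counter >= consec:
        blinks += 1
    return blinks

# Eyes open at ~0.30, with two dips below the threshold lasting 4 frames each
series = [0.30] * 10 + [0.10] * 4 + [0.30] * 10 + [0.12] * 4 + [0.30] * 5
print(count_blinks(series))  # → 2
```

A 1- or 2-frame dip (e.g. detector jitter on a single frame) is correctly ignored, which is the whole point of `EYE_AR_CONSEC_FRAMES`.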
Step 3: Running the Analysis
- Download the Clip: Save the viral video from X/Twitter as `musk_kamath_clip.mp4` in the same folder as your script.
- Run the Script: Execute `python detector.py`.
- Watch the Overlay: You will see green lines drawn around Elon and Nikhil’s eyes.
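One caveat before you trust the numbers: `EYE_AR_CONSEC_FRAMES = 3` implicitly assumes roughly 30 fps footage. The closed phase of a real blink lasts on the order of 100 ms or more, so if the downloaded clip has a different frame rate (readable via `cap.get(cv2.CAP_PROP_FPS)`), the frame threshold should scale with it. A small sketch of that conversion; the 100 ms default is an assumption you may want to tune:

```python
def min_blink_frames(fps, min_closure_ms=100):
    """Convert a minimum eyelid-closure duration into a consecutive-frame threshold."""
    return max(1, round(fps * min_closure_ms / 1000))

# At 30 fps this reproduces the script's EYE_AR_CONSEC_FRAMES = 3
print(min_blink_frames(30))  # → 3
print(min_blink_frames(60))  # → 6
```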
How to Interpret the Results
- The “Human” Result: A normal person blinks roughly 15–20 times per minute; over a 40-second clip that works out to about 10–13 blinks, so even a conservative count should show at least 4 to 8. The EAR number should fluctuate constantly.
- The “AI” Result: If the `Blinks` counter stays at 0 or 1 for the entire duration, or if the EAR value stays “stuck” around 0.30 without ever dipping, it is highly probable that the video is AI-generated (likely an image-to-video model like Luma or Runway Gen-3, which animates faces but often forgets blink physics).
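The rules of thumb above can be folded into a simple verdict function. This is a sketch using the article’s heuristics as thresholds, not calibrated forensic values:

```python
def verdict(blink_count, duration_s, human_rate_per_min=15.0):
    """Rough heuristic verdict based on the blink count from detector.py."""
    expected = human_rate_per_min * duration_s / 60.0
    if blink_count <= 1:
        return f"suspicious: {blink_count} blink(s) in {duration_s:.0f}s, expected ~{expected:.0f}"
    return f"plausibly human: {blink_count * 60.0 / duration_s:.1f} blinks/min"

print(verdict(0, 39))  # the "AI" pattern: a 39-second clip with no blinks
print(verdict(6, 39))  # within the normal human range
```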
Why This Matters
We are entering an era where we can no longer trust our eyes. By building tools like this, we move from passive consumers of content to active analysts. This simple script is your first line of defense against the misinformation age.