Deepfake detection challenge from R

May 2, 2025

Introduction

Working with video datasets, particularly for detecting AI-generated fake objects, is challenging because it requires proper frame selection and face detection. To approach this challenge from R, one can make use of the capabilities offered by OpenCV, magick, and keras.

Our approach consists of the following consecutive steps:

read all the videos
capture and extract images from the videos
detect faces in the extracted images
crop the faces
build an image classification model with Keras

Let’s quickly introduce the non-deep-learning libraries we’re using. OpenCV is a computer vision library; in this post it is used to detect faces.

magick, on the other hand, is an open-source image-processing library that helps to read and extract useful features from video datasets:

Read video files
Extract images per second from the video
Crop the faces from the images

Before we go into a detailed explanation, readers should know that there is no need to copy-paste code chunks: at the end of the post there is a link to a Google Colab notebook with GPU acceleration. This kernel allows everyone to run the code and reproduce the same results.

Data exploration

The dataset that we are going to analyze is provided by AWS, Facebook, Microsoft, the Partnership on AI’s Media Integrity Steering Committee, and various academics.

It contains both real and AI-generated fake videos. The total size is over 470 GB. However, a 4 GB sample dataset is available separately.

The videos in the folders are in mp4 format and have various lengths. Our task is to determine the number of images to capture per second of video. We usually took 1-3 fps for every video.

Note: Set fps to NULL if you want to extract all frames.

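library(magick)
# capture 2 frames per second; the result holds all captured frames
video = magick::image_read_video("aagfhgtpmv.mp4", fps = 2)
# take the first frame (a raw bitmap) and turn it back into a magick image
vid_1 = video[[1]]
vid_1 = magick::image_read(vid_1) %>% image_resize('1000x1000')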

We saw just the first frame. What about the rest of them?

Looking at the gif, one can observe that some fakes are very easy to tell apart, but a small fraction looks pretty realistic. This is another challenge during data preparation.
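To repeat the extraction for every video in a folder, a minimal sketch might look like the following. The helper names (`frame_paths`, `extract_frames`), the `videos` and `frames` folder names, and the file-naming pattern are assumptions made for illustration; only `image_read_video()` and `image_write()` come from magick.

```r
# Build output file names such as frames/<video>_frame_001.jpg
# (the naming pattern is an assumption, not part of the original post).
frame_paths <- function(video_path, n_frames, out_dir) {
  stem <- tools::file_path_sans_ext(basename(video_path))
  file.path(out_dir, sprintf("%s_frame_%03d.jpg", stem, seq_len(n_frames)))
}

# Read one video at the given fps and write every captured frame as a JPEG.
# fps = NULL extracts all frames, as noted above.
extract_frames <- function(video_path, out_dir, fps = 2) {
  dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
  frames <- magick::image_read_video(video_path, fps = fps)
  paths <- frame_paths(video_path, length(frames), out_dir)
  for (i in seq_along(paths)) {
    magick::image_write(frames[i], path = paths[i], format = "jpeg")
  }
  invisible(paths)
}

# Hypothetical usage: process every mp4 in a local "videos" folder, if present.
if (dir.exists("videos")) {
  videos <- list.files("videos", pattern = "\\.mp4$", full.names = TRUE)
  all_frames <- unlist(lapply(videos, extract_frames, out_dir = "frames"))
}
```

Writing frames to disk keeps memory use bounded when many videos are processed, at the cost of extra storage.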

Face detection

First, face locations need to be determined as bounding boxes, using OpenCV. Then, magick is used to automatically extract them from all images.

# get face location and calculate bounding box
library(opencv)
library(magick)
unconf <- ocv_read('frame_1.jpg')
faces <- ocv_face(unconf)
facemask <- ocv_facemask(unconf)
df = attr(facemask, 'faces')
rectX = (df$x - df$radius)
rectY = (df$y - df$radius)
x = (df$x + df$radius)
y = (df$y + df$radius)

# draw the box with a purple dashed line
imh = image_draw(image_read('frame_1.jpg'))
rect(rectX, rectY, x, y, border = "purple",
     lty = "dashed", lwd = 2)
dev.off()

Once the face locations are found, it is very easy to extract them all.

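# crop with a hard-coded geometry: width x height + x offset + y offset
edited = image_crop(imh, "49x49+66+34")
# or build the same kind of geometry string from the detected bounding box
edited = image_crop(imh, paste(x-rectX+1,'x',x-rectX+1,'+',rectX, '+',rectY,sep = ''))
edited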
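When an image contains several faces, or a box starts outside the image, it can help to build the geometry strings in one place. The sketch below is one possible way to do that; `face_geometry` and `crop_faces` are hypothetical helper names, and the clamping of negative offsets to zero is an added assumption rather than something the original code does.

```r
# Turn detected faces (columns x, y, radius) into magick geometry strings
# of the form "<width>x<height>+<x_offset>+<y_offset>".
face_geometry <- function(faces) {
  left <- pmax(faces$x - faces$radius, 0)   # clamp boxes that start outside the image
  top  <- pmax(faces$y - faces$radius, 0)
  side <- 2 * faces$radius + 1              # same size as x - rectX + 1 above
  sprintf("%dx%d+%d+%d",
          as.integer(side), as.integer(side),
          as.integer(left), as.integer(top))
}

# Detect faces in one image file and return a list of cropped magick images.
crop_faces <- function(image_path) {
  mask  <- opencv::ocv_facemask(opencv::ocv_read(image_path))
  faces <- attr(mask, "faces")
  if (is.null(faces) || nrow(faces) == 0) return(list())
  img <- magick::image_read(image_path)
  lapply(face_geometry(faces), function(g) magick::image_crop(img, g))
}

# A face centred at (90, 58) with radius 24 gives the hard-coded geometry above.
face_geometry(data.frame(x = 90, y = 58, radius = 24))  # "49x49+66+34"
```

Returning a list keeps the case of zero detected faces harmless: callers simply get an empty list.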

Deep learning model

After dataset preparation, it is time to build a deep learning model with Keras. We can quickly place all the images into folders and, using image generators, feed the faces to a pre-trained Keras model.
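The directory-based image generator reads class labels from subfolder names, so the cropped faces need to sit in one folder per class. A minimal sketch of that sorting step follows; the helper name `sort_into_class_dirs`, the `fake` and `real` labels, and the example file names are all assumptions for illustration, and where the labels come from is left open here.

```r
# Copy each face image into <train_dir>/<label>/, creating class folders as needed.
sort_into_class_dirs <- function(files, labels, train_dir = "fakes_reals") {
  labels <- as.character(labels)
  stopifnot(length(files) == length(labels))
  for (cls in unique(labels)) {
    dir.create(file.path(train_dir, cls), recursive = TRUE, showWarnings = FALSE)
  }
  dest <- file.path(train_dir, labels, basename(files))
  copied <- file.copy(files, dest, overwrite = TRUE)
  invisible(dest[copied])  # paths of the files that were actually copied
}

# Hypothetical usage with two face crops and their labels:
# sort_into_class_dirs(c("faces/a_1.jpg", "faces/b_1.jpg"), c("fake", "real"))
```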

library(keras)

train_dir = 'fakes_reals'
width = 150L
height = 150L
epochs = 10

# augment the training images and rescale pixel values to [0, 1]
train_datagen = image_data_generator(
  rescale = 1/255,
  rotation_range = 40,
  width_shift_range = 0.2,
  height_shift_range = 0.2,
  shear_range = 0.2,
  zoom_range = 0.2,
  horizontal_flip = TRUE,
  fill_mode = "nearest",
  validation_split = 0.2
)


train_generator <- flow_images_from_directory(
  train_dir,
  train_datagen,
  target_size = c(width, height),
  batch_size = 10,
  class_mode = "binary"
)

# Build the model ---------------------------------------------------------

# pre-trained VGG16 convolutional base without its classification head
conv_base <- application_vgg16(
  weights = "imagenet",
  include_top = FALSE,
  input_shape = c(width, height, 3)
)

model <- keras_model_sequential() %>%
  conv_base %>%
  layer_flatten() %>%
  layer_dense(units = 256, activation = "relu") %>%
  layer_dense(units = 1, activation = "sigmoid")

model %>% compile(
  loss = "binary_crossentropy",
  # newer keras releases name this argument learning_rate
  optimizer = optimizer_rmsprop(lr = 2e-5),
  metrics = c("accuracy")
)

history <- model %>% fit_generator(
  train_generator,
  steps_per_epoch = ceiling(train_generator$samples/train_generator$batch_size),
  epochs = epochs
)
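The data generator above sets `validation_split = 0.2`, but the training call only uses a single generator. One way to make use of the split, sketched under assumptions, is to request the `"training"` and `"validation"` subsets separately and pass the latter as validation data. The helper names (`steps_for`, `make_generators`, `fit_with_validation`) are hypothetical, and note that with a shared generator the validation images are augmented too.

```r
# Number of batches needed to see every sample once.
steps_for <- function(n_samples, batch_size) ceiling(n_samples / batch_size)

# Build training and validation generators from the same directory.
# `datagen` must have been created with validation_split set (0.2 in the post).
make_generators <- function(train_dir, datagen, width = 150L, height = 150L,
                            batch_size = 10) {
  make <- function(subset) {
    keras::flow_images_from_directory(
      train_dir, datagen,
      target_size = c(width, height),
      batch_size = batch_size,
      class_mode = "binary",
      subset = subset
    )
  }
  list(train = make("training"), validation = make("validation"))
}

# Train with a validation pass after every epoch, using fit_generator() as in the post.
fit_with_validation <- function(model, gens, epochs = 10, batch_size = 10) {
  keras::fit_generator(
    model, gens$train,
    steps_per_epoch = steps_for(gens$train$samples, batch_size),
    epochs = epochs,
    validation_data = gens$validation,
    validation_steps = steps_for(gens$validation$samples, batch_size)
  )
}

# Hypothetical usage, assuming train_dir, train_datagen and model from above:
# gens <- make_generators(train_dir, train_datagen)
# history <- fit_with_validation(model, gens)
```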

Reproduce in a Notebook

Conclusion

This post shows how to do video classification from R. The steps were:

Read videos and extract images from the dataset
Apply OpenCV to detect faces
Extract faces via bounding boxes
Build a deep learning model

However, readers should know that implementing the following steps may drastically improve model performance:

extract all of the frames from the video files
load different pre-trained weights, or use different pre-trained models
use another technology to detect faces – e.g., “MTCNN face detector”

Feel free to try these options on the Deepfake detection challenge and share your results in the comments section!

Thanks for reading!
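As an illustration of the second option, the convolutional base can be chosen by name so that different pre-trained models are easy to compare. This is only a sketch: `make_conv_base` is a hypothetical helper, the three candidate models are an arbitrary selection, and other bases may expect different input preprocessing than the 1/255 rescaling used in this post.

```r
# Return a pre-trained convolutional base by name
# (ImageNet weights, classification head removed).
make_conv_base <- function(name = c("vgg16", "resnet50", "xception"),
                           width = 150L, height = 150L) {
  name <- match.arg(name)  # fails early on an unsupported name
  builder <- switch(name,
    vgg16    = keras::application_vgg16,
    resnet50 = keras::application_resnet50,
    xception = keras::application_xception
  )
  builder(weights = "imagenet", include_top = FALSE,
          input_shape = c(width, height, 3))
}

# Hypothetical usage: swap the base used in the model definition.
# conv_base <- make_conv_base("xception")
```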

Corrections

If you see mistakes or want to suggest changes, please create an issue on the source repository.

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/henry090/Deepfake-from-R, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: “Figure from …”.

Citation

For attribution, please cite this work as

Abdullayev (2020, Aug. 18). Posit AI Blog: Deepfake detection challenge from R. Retrieved from https://blogs.rstudio.com/tensorflow/posts/2020-08-18-deepfake/

BibTeX citation

@misc{abdullayev2020deepfake,
  author = {Abdullayev, Turgut},
  title = {Posit AI Blog: Deepfake detection challenge from R},
  url = {https://blogs.rstudio.com/tensorflow/posts/2020-08-18-deepfake/},
  year = {2020}
}
