How to Extract Image Metadata in Python

Abdeladim Fadheli · 4 min read · Updated Jul 2022 · Ethical Hacking · Web Scraping · Digital Forensics



In this tutorial, you will learn how to extract useful metadata from images using the Pillow library in Python.

Devices such as digital cameras, smartphones, and scanners use the EXIF standard to save images or audio files. This standard defines many tags that can be useful for forensic investigation, such as the make and model of the device, the exact date and time the image was created, and, on some devices, even GPS information.

Please note that there are free tools for extracting metadata, such as ImageMagick or ExifTool on Linux. The goal of this tutorial, however, is to extract metadata with the Python programming language.


To get started, you need to install the Pillow library:

pip3 install Pillow

Open up a new Python file and follow along:

from PIL import Image
from PIL.ExifTags import TAGS

This tutorial targets JPEG files, which is where EXIF data is most commonly found. Take any photo you shot and use it to follow along (if you want to test on my image, you'll find it in the tutorial's repository):

# path to the image file
imagename = "image.jpg"

# read the image data using PIL
image = Image.open(imagename)

We loaded the image using the Image.open() function. Before calling getexif(), note that the image object already exposes some basic attributes; let's print them out:

# extract other basic metadata
info_dict = {
    "Filename": image.filename,
    "Image Size": image.size,
    "Image Height": image.height,
    "Image Width": image.width,
    "Image Format": image.format,
    "Image Mode": image.mode,
    "Image is Animated": getattr(image, "is_animated", False),
    "Frames in Image": getattr(image, "n_frames", 1)
}

for label, value in info_dict.items():
    print(f"{label:25}: {value}")


Now let's call the getexif() method on the image, which returns the image's EXIF metadata:

# extract EXIF data
exifdata = image.getexif()
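
Not every image carries EXIF data (screenshots and images stripped by social networks often don't), so it can help to check before iterating. Below is a minimal sketch, assuming the object returned by getexif() behaves like a dictionary that is simply empty when nothing was found; has_exif is a hypothetical helper name, not part of Pillow:

```python
from PIL import Image


def has_exif(image):
    """Return True if Pillow found at least one EXIF field (hypothetical helper)."""
    # getexif() returns a dict-like object; it is empty when there is no EXIF data
    return len(image.getexif()) > 0


# an image created in memory carries no EXIF data
blank = Image.new("RGB", (8, 8))
print(has_exif(blank))
```

With the image opened earlier, you would call has_exif(image) and only continue to the next step if it returns True.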

The problem with the exifdata variable is that its keys are numeric tag IDs rather than human-readable field names. That's why we need the TAGS dictionary from the PIL.ExifTags module, which maps each tag ID to a human-readable name:

# iterate over all EXIF data fields
for tag_id in exifdata:
    # get the human-readable tag name, falling back to the numeric ID
    tag = TAGS.get(tag_id, tag_id)
    data = exifdata.get(tag_id)
    # decode bytes, replacing any byte sequences that are not valid UTF-8
    if isinstance(data, bytes):
        data = data.decode(errors="replace")
    print(f"{tag:25}: {data}")
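
If you'd rather work with the fields than print them, the same loop can be wrapped in a function that returns a dictionary. The sketch below is one way to do it; exif_to_dict and EXIF_IFD_POINTER are names I made up for illustration. It also assumes that, in recent Pillow releases, fields such as ExifVersion or DateTimeOriginal sit in a nested Exif IFD reachable through get_ifd(), so it merges that IFD when the method is available:

```python
from PIL import Image
from PIL.ExifTags import TAGS

EXIF_IFD_POINTER = 0x8769  # tag ID of the pointer to the nested Exif IFD


def exif_to_dict(image):
    """Return EXIF fields as {tag name: value} (hypothetical helper)."""
    exifdata = image.getexif()
    # start with the top-level fields (Make, Model, DateTime, ...)
    fields = dict(exifdata)
    # assumption: newer Pillow versions expose the nested Exif IFD via get_ifd()
    if hasattr(exifdata, "get_ifd"):
        fields.update(exifdata.get_ifd(EXIF_IFD_POINTER))
    named = {}
    for tag_id, value in fields.items():
        # decode bytes, replacing anything that is not valid UTF-8
        if isinstance(value, bytes):
            value = value.decode(errors="replace")
        # fall back to the numeric ID when the tag has no known name
        named[TAGS.get(tag_id, tag_id)] = value
    return named
```

For example, exif_to_dict(image).get("Model") would give you the device model, or None if the field is missing.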

Here is my output:

Filename                 : .\image.jpg
Image Size               : (5312, 2988)       
Image Height             : 2988
Image Width              : 5312
Image Format             : JPEG
Image Mode               : RGB
Image is Animated        : False
Frames in Image          : 1
ExifVersion              : 0220
ShutterSpeedValue        : 4.32
ApertureValue            : 1.85
DateTimeOriginal         : 2016:11:10 19:33:22
DateTimeDigitized        : 2016:11:10 19:33:22
BrightnessValue          : -1.57
ExposureBiasValue        : 0.0
MaxApertureValue         : 1.85
MeteringMode             : 3
Flash                    : 0
FocalLength              : 4.3
ColorSpace               : 1
ExifImageWidth           : 5312
FocalLengthIn35mmFilm    : 28
SceneCaptureType         : 0
ImageWidth               : 5312
ExifImageHeight          : 2988
ImageLength              : 2988
Make                     : samsung
Model                    : SM-G920F
Orientation              : 1
YCbCrPositioning         : 1
XResolution              : 72.0
YResolution              : 72.0
ImageUniqueID            : A16LLIC08SM A16LLIL02GM

ExposureProgram          : 2
ISOSpeedRatings          : 640
ResolutionUnit           : 2
ExposureMode             : 0
FlashPixVersion          : 0100
WhiteBalance             : 0
Software                 : G920FXXS4DPI4
DateTime                 : 2016:11:10 19:33:22
ExifOffset               : 226
MakerNote                : 0100 
                                Z@P
UserComment              :
ExposureTime             : 0.05
FNumber                  : 1.9

A bunch of useful stuff; a quick search for the Model value told me that this image was taken with a Samsung Galaxy S6. Run this on images captured by other devices, and you'll see different (maybe more) fields.
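
One of those extra fields is the GPS information mentioned at the start. My test image doesn't have any, so treat the following as an untested-on-real-data sketch: it assumes the GPS fields live in a separate GPS IFD that get_ifd() can read, and that coordinates are stored as (degrees, minutes, seconds) tuples. gps_to_dict and dms_to_decimal are hypothetical helper names:

```python
from PIL import Image
from PIL.ExifTags import GPSTAGS

GPS_IFD_POINTER = 0x8825  # tag ID of the pointer to the GPS IFD


def gps_to_dict(image):
    """Return GPS fields as {tag name: value}; empty if the image has none."""
    gps_ifd = image.getexif().get_ifd(GPS_IFD_POINTER)
    return {GPSTAGS.get(tag_id, tag_id): value for tag_id, value in gps_ifd.items()}


def dms_to_decimal(dms, ref):
    """Convert (degrees, minutes, seconds) and an N/S/E/W reference to decimal degrees."""
    degrees, minutes, seconds = (float(part) for part in dms)
    decimal = degrees + minutes / 60 + seconds / 3600
    # southern and western coordinates are negative by convention
    return -decimal if ref in ("S", "W") else decimal
```

If gps_to_dict(image) contains GPSLatitude and GPSLatitudeRef, passing both to dms_to_decimal() gives a latitude you can paste into a map.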

Alright, we're done. A good challenge for you is to download all images from a URL and then run this tutorial's script on every image you find and investigate the interesting results!
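
To get you started on that challenge, here is a rough sketch of the second half: running the extraction over every file in a folder. It assumes the images have already been downloaded into that folder, only reads the top-level EXIF fields, and skips anything Pillow can't open; scan_folder is a hypothetical name:

```python
from pathlib import Path

from PIL import Image
from PIL.ExifTags import TAGS


def scan_folder(folder):
    """Map each readable image in `folder` to its named top-level EXIF fields."""
    results = {}
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        try:
            with Image.open(path) as image:
                exifdata = image.getexif()
                results[path.name] = {
                    TAGS.get(tag_id, tag_id): value
                    for tag_id, value in exifdata.items()
                }
        except OSError:
            # Pillow could not identify or read this file as an image; skip it
            continue
    return results
```

Calling scan_folder("images") returns one dictionary per image, which you can then compare to spot which devices the pictures came from.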


Happy Coding ♥
