The project aims to develop an Explainable Multimodal Large Language Model (MLLM) framework for the proactive detection of image-borne cybersecurity threats. Modern image-based attacks, such as zero-click exploits and AI-driven steganography, pose a severe challenge to Sweden’s digital infrastructure because malicious code can be hidden inside everyday image formats. Traditional defenses often fail against these evolving threats, which lack known signatures or explicit indicators.
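As a concrete illustration of one such hiding technique (not the proposed model itself), the minimal Python sketch below flags bytes appended after an image’s end-of-stream marker; most viewers ignore anything past that point, which makes the file tail a classic hiding place for smuggled payloads. The function name trailing_bytes and the two-format marker table are illustrative simplifications.

```python
import sys

# End-of-image markers for two common formats. A PNG stream ends with the
# IEND chunk (type "IEND" followed by its fixed CRC); a JPEG stream ends
# with the EOI marker 0xFFD9. Most viewers ignore bytes past these markers,
# so an appended payload leaves the image rendering perfectly normally.
END_MARKERS = {
    b"\x89PNG\r\n\x1a\n": b"IEND\xaeB`\x82",  # PNG signature -> IEND type + CRC
    b"\xff\xd8": b"\xff\xd9",                 # JPEG SOI -> EOI
}

def trailing_bytes(path: str) -> int:
    """Return the number of bytes appended after the image's end marker,
    or 0 if the format is unrecognized or nothing follows the marker."""
    with open(path, "rb") as f:
        data = f.read()
    for magic, end in END_MARKERS.items():
        if data.startswith(magic):
            pos = data.rfind(end)  # may undercount if the payload itself contains the marker
            if pos != -1:
                return len(data) - (pos + len(end))
    return 0

if __name__ == "__main__":
    for path in sys.argv[1:]:
        extra = trailing_bytes(path)
        if extra:
            print(f"{path}: {extra} bytes after end-of-image marker (suspicious)")
```

Hand-written checks like this catch only the crudest cases; that gap is precisely what a learned model of image normalcy is meant to close.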
Our research will design and train an MLLM capable of learning a deep representation of “image normalcy” from large-scale benign datasets, enabling it to detect statistically and semantically abnormal image patterns that may signal embedded malicious payloads. The model will fuse multiple modalities (visual patterns, file structure, metadata, and byte-level characteristics) to achieve robust, unsupervised detection of both known and novel (zero-day) threats.
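A minimal sketch of this fusion-and-scoring idea follows, assuming PyTorch, toy MLP encoders standing in for the actual MLLM, and hypothetical names (FusionAnomalyScorer, fit_benign): per-modality embeddings are concatenated, and new images are scored by Mahalanobis distance to the benign embedding distribution, a standard unsupervised out-of-distribution criterion.

```python
import torch
import torch.nn as nn

class FusionAnomalyScorer(nn.Module):
    """Toy stand-in for the proposed model: one small encoder per modality,
    late fusion by concatenation, and an unsupervised anomaly score via
    Mahalanobis distance to a distribution fitted on benign images only."""

    def __init__(self, dims: dict, embed_dim: int = 32):
        super().__init__()
        # dims maps modality name -> raw feature width, e.g. visual stats,
        # parsed metadata fields, byte-value histogram.
        self.encoders = nn.ModuleDict({
            name: nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, embed_dim))
            for name, d in dims.items()
        })
        self.mean = None
        self.cov_inv = None

    def embed(self, feats: dict) -> torch.Tensor:
        # Concatenate modality embeddings in a fixed (sorted) order.
        return torch.cat([self.encoders[k](v) for k, v in sorted(feats.items())], dim=-1)

    @torch.no_grad()
    def fit_benign(self, feats: dict) -> None:
        # Estimate the benign "normalcy" distribution from clean images only.
        z = self.embed(feats)
        self.mean = z.mean(dim=0)
        cov = torch.cov(z.T) + 1e-3 * torch.eye(z.shape[1])  # regularized covariance
        self.cov_inv = torch.linalg.inv(cov)

    @torch.no_grad()
    def score(self, feats: dict) -> torch.Tensor:
        # Higher distance from the benign mean => more anomalous.
        d = self.embed(feats) - self.mean
        return torch.einsum("bi,ij,bj->b", d, self.cov_inv, d)

if __name__ == "__main__":
    dims = {"visual": 128, "metadata": 16, "bytes": 256}
    model = FusionAnomalyScorer(dims)
    benign = {k: torch.randn(512, d) for k, d in dims.items()}  # stand-in benign features
    model.fit_benign(benign)
    print(model.score({k: torch.randn(4, d) for k, d in dims.items()}))
```

Concatenation is the simplest fusion choice; the full model would more plausibly fuse modalities with cross-attention inside the MLLM, but the core principle of estimating normalcy purely from benign data carries over unchanged.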
Beyond identifying anomalies, the framework will infer the functional intent of a potential threat (e.g., downloader, exfiltrator, or exploit vector) and generate human-readable explanations that help security professionals understand and trust its predictions.
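The intended output can be previewed with a deliberately simple sketch: the indicator names, the INTENT_HINTS table, and the rule-based mapping below are hypothetical placeholders for what the trained model would infer, but they show how structured findings would be rendered into a ranked, human-readable rationale.

```python
from dataclasses import dataclass

# Hypothetical indicator -> (intent, rationale) table; the real framework
# would infer intent from learned representations, not hand-written rules.
INTENT_HINTS = {
    "url_in_trailing_bytes": ("downloader", "an embedded URL suggests the payload fetches a second stage"),
    "archive_after_eoi": ("exfiltrator", "an appended archive may stage data for exfiltration"),
    "malformed_chunk_length": ("exploit vector", "an oversized chunk length is typical of parser exploits"),
}

@dataclass
class Finding:
    indicator: str
    anomaly_score: float

def explain(findings: list) -> str:
    """Render structured findings as a ranked, human-readable rationale."""
    if not findings:
        return "No anomalous indicators; image is consistent with the benign distribution."
    lines = []
    for f in sorted(findings, key=lambda f: -f.anomaly_score):
        intent, why = INTENT_HINTS.get(f.indicator, ("unknown", "indicator falls outside known patterns"))
        lines.append(f"- {f.indicator} (score {f.anomaly_score:.2f}): likely {intent}; {why}.")
    return "Suspicious image. Evidence, ranked by anomaly score:\n" + "\n".join(lines)

print(explain([Finding("url_in_trailing_bytes", 8.4), Finding("malformed_chunk_length", 3.1)]))
```

Tying each intent label to the specific indicator that triggered it is what lets an analyst audit, and ultimately trust, the verdict.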
This work will culminate in an open-source research toolkit providing Swedish institutions with actionable, explainable, and proactive capabilities to secure image pipelines across critical sectors such as identity verification, e-government, and digital forensics.