Section 1.1 — Comparing AI Types and Techniques Used in Cybersecurity

wpnews.pro

Hi, it's Furkan. I'm a security professional prepping for the CompTIA SecAI+ (CY0-001) cert, and I couldn't find study material that actually clicked for me, so I built my own and structured it around the exam blueprint. This is me sharing it back. Each post maps to one objective, and I've leaned hard on real-world scenarios because that's what made it stick for me. If it helps you pass too, even better.

CompTIA SecAI+ CY0-001 | Domain 1.0: Basic AI Concepts Related to Cybersecurity

"Compare and contrast various AI types and techniques used in cybersecurity."

This one breaks down into three big building blocks:

AI Types — which kind of AI does what, and where it shows up in security

Model Training Techniques — how a model actually gets trained and tuned

Prompt Engineering — how to ask an AI the right question

Ready? Let's get into it.

AI isn't one single thing. Different problems call for different AI approaches. Think of a carpenter's toolbox: there's a hammer, a screwdriver, a saw. They all do different jobs, but they're all "tools." AI types work the same way.

What it does: Creates new content. Text, images, audio, code; whatever you ask for.

How it works: Models trained on huge amounts of data use the patterns they've learned to produce outputs that never existed before. They don't memorize, they learn patterns and then build new things out of them.

In security:

Defense side: Building security awareness training by simulating phishing emails, auto-generating incident response playbooks, drafting security reports.

Offense side: Attackers use generative AI to spin up convincing phishing emails, deepfake audio, or malicious code.

🍕 Real-World Example:A security team wants to test their employees, so they use generative AI to write phishing emails that nail the CEO's writing style. "Hey, I need you to push through an urgent payment...", the email is so realistic that 40% of staff click. That right there is why you need to understand this techandbuild defenses against it.

What it does: Learns patterns from data and then makes predictions or decisions based on those patterns. Nobody hard-codes the rules, it "learns" them from the data.

The core idea: In traditional programming, you write the rules. In ML, you hand over the data and the model figures out the rules itself.

Traditional Programming:  Rules + Data → Output
Machine Learning:         Data + Output → Rules (Model)

In security:

Spam filtering (email classification)

Malware detection (file behavior analysis)

Network anomaly detection (catching deviations from normal traffic)

User behavior analytics (UBA); spotting when a user does something out of character

🍕 Real-World Example:A bank uses ML to learn its customers' normal spending habits. Dave grabs a coffee in the city every morning. Then one night at 3 AM, a $5,000 charge comes through from Brazil. The model goes: "Yeah, that's not right" and blocks the transaction and pings Dave.

What it does: Methods for drawing conclusions from data, grounded in statistical theory. Call it the mathematical backbone of ML.

How is it different from ML? Statistical learning leans more toward the "why" and cares about how interpretable the model is. ML usually leans toward the "what's going to happen" and puts predictive performance first. In practice there's no hard line between them, plenty of ML algorithms sit on statistical foundations anyway.

Core techniques: Regression, classification, clustering, hypothesis testing.

In security:

Statistical anomaly detection in log data

Risk scoring — calculating a user's or device's risk level with a statistical model

Baselining — defining "normal" behavior in statistical terms

🍕 Real-World Example:A SOC team analyzes the statistical distribution of DNS queries on the network. On a normal day they see around 50,000 queries with a standard deviation of 5,000. One day they spot 200,000 — that's 30 standard deviations out. Either there's a DDoS in progress, or some malware is chatting with a command & control (C2) server.

What it does: The architecture behind the modern AI revolution. Instead of reading words one at a time, it processes the whole thing at once and figures out how the words relate to each other.

Why it matters: It landed in 2017 with Google's "Attention Is All You Need" paper. Earlier models (RNNs, LSTMs) read text left to right, one step at a time and lost the thread on long sentences. The Transformer's "self-attention" mechanism works out how every word relates to every other word, all at once.

The key concept — self-attention: Take the sentence "Dave went to the bank because he wanted to deposit money." Self-attention is what lets the model know that "he" points back to "Dave." The Transformer is the architecture that can actually make that connection.

In security:

Phishing detection (understanding the context of the text, not just keywords)

Auto-analyzing threat intelligence reports

Malicious code analysis (grasping the semantic structure of code)

Log analysis and anomaly detection

🍕 Real-World Example:An old-school spam filter flagged any email with the word "free" in it. Problem is, it also flagged "this product is not free" , because it had no clue about context. A Transformer-based model reads "free" in context and makes the right call.

What it does: Uses neural networks stacked into many layers to learn complex patterns. The "deep" refers to how many layers deep the network goes.

How it relates to ML: Deep learning is a subset of machine learning. Every deep learning model is an ML model, but not every ML model is deep learning.

AI
 └── Machine Learning
      └── Deep Learning
           └── Transformers, CNN, RNN, GAN...

Why "deep"? Because the network has more than one hidden layer. A simple single-layer network can only learn simple patterns. But a deep network with 10, 50, even 100+ layers can pick up on absurdly complex ones.

In security:

Zero-day malware detection — catching never-before-seen malware by how it behaves

Network intrusion detection — spotting complex attack patterns in traffic

Image-based CAPTCHA-bypass detection

Deepfake detection (ironically, deep learning catches what deep learning made)

🍕 Real-World Example:Traditional antivirus is signature-based, it looks for the fingerprints of known bad files. But the moment an attacker tweaks the malware's code a little (polymorphic malware), the signature doesn't match and the AV whiffs. A deep-learning-based EDR (Endpoint Detection and Response) system instead watches the file'sbehavior: "This thing is trying to open a hidden network connection, and now it's starting to encrypt files..." that pattern looks like ransomware, shut it down.

What it does: Lets machines understand, process, and generate human language — both speech and text.

In security:

Detecting phishing emails and messages through linguistic analysis

Automatically reading and summarizing threat intel reports

Scraping dark web forums and pulling out threat information

Auto-generating reports for security incidents

There are three NLP sub-concepts that'll come up on the exam:

What it does: Broad, general-purpose language models trained with billions of parameters. A single model can handle understanding, generation, translation, summarization, and more.

Examples: GPT-4, Claude, Gemini, LLaMA

Characteristics:

Billions (sometimes trillions) of parameters

Trained on enormous datasets

General-purpose — one model handles lots of different tasks

Needs serious compute (GPU clusters)

In security:

Helping SOC analysts work through attack analysis

Drafting security policies

Automated triage during incident response

What it does: Slimmed-down, domain-focused versions of LLMs. Fewer parameters, narrower scope but more focused on the job at hand.

Examples: Phi-3, Gemma, Mistral 7B

LLM vs SLM:

Feature	LLM	SLM
Parameter count	Billions–Trillions	Millions–Few billion
Training cost	Very high	Relatively low
Where it runs	Cloud / GPU cluster	Edge device / single GPU
Task scope	General-purpose	Narrow, specific tasks
Latency	Can be high	Low
Privacy	Data may leave for the cloud	Can run locally

Why SLMs win in security:

Run locally, so sensitive data never has to leave the building

Real-time threat detection on edge devices (firewalls, IoT gateways)

Fast decisions thanks to low latency

🍕 Real-World Example:A military org wants to analyze classified documents but can't send the data to the cloud. So they stand up a local language model on their own servers using an SLM. It summarizes threat intel reports without ever creating a privacy breach.

What it does: Pits two neural networks against each other to produce realistic synthetic data.

How it works: Picture a forger and a detective:

Generator: Produces fake data — that's the forger.

Discriminator: Tries to tell real from fake — that's the detective.

The two are in a constant arms race. The forger gets better at faking, the detective gets better at spotting. By the end of that race, the forger is so good that telling real from fake is nearly impossible.

In security:

Deepfake creation and detection — GANs are both the creator of deepfakes and their nemesis

Generating synthetic attack data — when training data is scarce, GANs can manufacture realistic attack samples

Crafting adversarial examples — to stress-test ML-based security systems

Password cracking — GANs can learn realistic password patterns and supercharge cracking attacks

🍕 Real-World Example:A fintech company's fraud detection system doesn't have enough training data, because real fraud cases are rare. So they use a GAN to generate thousands of synthetic fraud scenarios and train the detection model on that. The result: real fraud detection jumps 35%.

For an AI model to be "smart," it has to be trained. In this part you'll learn how a model gets trained, validated, and fine-tuned.

What it does: Measures how well a trained model actually performs in the real world.

Why it matters: A model might have just memorized the training data (overfitting) — 99% accurate on training data, garbage the moment new data shows up. Model validation is how you catch that.

Core methods:

Train/Test Split: Split the data in two — 80% training, 20% test. The model learns from the training set and gets tested on the test set.

Cross-Validation: Split the data into K folds (say, 5). Each round, one fold is the test set and the rest is training. Repeat 5 times, then average the results. Gives you a more trustworthy read.

Holdout Validation: Set aside a chunk of data that the model never sees, and use it for the final evaluation.

Evaluation metrics:

Metric	What it measures	Security example
Accuracy
Share of all predictions that were correct	Overall malware-detection success
Precision
Of everything flagged "malicious," how much actually was	How few false positives?
Recall
Of all the real threats, how many you caught	How much malware slipped through?
F1 Score
The balanced average of precision and recall	Overall balance

🍕 Real-World Example:A malware detection model shows 99.9% accuracy. Looks amazing, right? Except 99.9% of the files in the dataset are clean and 0.1% are malicious. The model could just say "everything's clean" and still hit 99.9% accuracy. That's exactly why you never look at accuracy alone, you check precision and recall too. In security, recall usually matters more, because missing a real threat is far more dangerous than throwing a false alarm.

What it does: You feed the model labeled data, both the input and the correct answer. The model learns the relationship between the two.

Analogy: You show a kid pictures of animals: "This is a cat, this is a dog, this is a bird." After enough examples, you show a new picture and the kid goes "Cat!"

How it works:

Prep the labeled data (input → label)

Train the model on it

The model can now predict on new inputs

In security:

Spam/phishing detection — emails labeled "spam" or "not spam"

Malware classification — files labeled "malicious" or "clean"

Intrusion detection — network traffic labeled "attack" or "normal"

Upsides: High accuracy, interpretable results. Downsides: Collecting labeled data is expensive and slow, and labeling mistakes can wreck the model.

What it does: Works on unlabeled data. The model discovers the hidden patterns, groups, and structure in the data on its own.

Analogy: You hand a kid hundreds of animal pictures without naming a single one. The kid groups them anyway, "these look alike, these are different." Doesn't know the names, but finds the patterns.

Core techniques:

Clustering: Grouping similar data points together

Anomaly Detection: Learn what "normal" looks like, then flag anything that strays from it

Dimensionality Reduction: Representing high-dimensional data in fewer dimensions

In security:

Zero-day attack detection — catching never-before-seen attacks as "abnormal behavior"

User and Entity Behavior Analytics (UEBA) — modeling user behavior and flagging anomalies

Network traffic clustering — grouping similar traffic and investigating the odd clusters

🍕 Real-World Example:A company's SIEM uses unsupervised learning to learn employees' normal working hours and access patterns. One night, an account in the finance department starts hitting the engineering servers at 3 AM. The system flags it as an anomaly and the investigation reveals the account was compromised.

What it does: An "agent" takes actions in an environment and gets a reward or penalty depending on how those actions turn out. The goal is to maximize total reward.

Analogy: You're teaching a dog to sit. It sits, it gets a treat. It doesn't sit, no treat. Over time the dog connects the dots: "When I sit, I get the treat."

Agent → takes an action → something changes in the environment → reward/penalty → agent learns → repeat

In security:

Adaptive defense systems — auto-tuning defense strategy based on attack patterns

Automated penetration testing — an AI agent discovering and exploiting vulnerabilities in a network

Firewall rule optimization — automatically optimizing rules based on traffic

Dynamic honeypot management — changing honeypot behavior depending on the attacker

🍕 Real-World Example:A security firm builds an automated pentesting tool with reinforcement learning. The agent tries different attack vectors against a target network. A successful exploit earns +10 points; getting caught costs -5. Over time the agent learns which techniques to use in what order — doing in hours what would take a human pentester weeks.

What it does: Takes a pre-trained, general-purpose model and specializes it for a specific domain or task through extra training.

Analogy: Picture a doctor fresh out of med school — a generalist (the pre-trained model). Then they specialize in cardiology (fine-tuning). The foundational medical knowledge is still there, but now they're a heart expert.

Why not just train from scratch?

Training from scratch is brutally expensive (millions of dollars, weeks of GPU time)

Fine-tuning is far cheaper and faster

The foundational knowledge is already in the model — you're just adding the specialty

In security:

Fine-tuning a general LLM on threat intel reports to turn it into an attack-analysis expert

Building a customized model for malware analysis

Developing a model that speaks your org's specific security policies and terminology

There are three fine-tuning concepts you'll need for the exam:

What it means: One full pass of the entire training dataset through the model.

Analogy: Reading a book cover to cover = 1 epoch. Reading it 10 times = 10 epochs. Each pass, you understand it a little better.

Why it matters:

Too few epochs → the model doesn't learn enough (underfitting)

Too many epochs → the model starts memorizing (overfitting)

The right number → the model generalizes and works well on new data too

In practice: You usually stop training right when validation loss starts creeping back up (early stopping).

What it does: Removes the unnecessary or low-impact connections (weights/neurons) from a trained model to make it smaller and faster.

Analogy: Like pruning a tree. You cut off the dead branches, and the tree grows healthier and more efficiently. The "dead branches" in a model are the near-zero-weight connections that contribute nothing to the output.

Types:

Weight Pruning: Setting small weights to zero

Neuron/Filter Pruning: Removing entire neurons or filters

Structured vs. Unstructured: Dropping whole layers structurally vs. removing individual connections one by one

Why it matters in security:

Shrinking the model so it can run on edge devices (firewalls, IoT gateways)

Boosting inference speed for real-time threat detection

Cutting cloud costs

What it does: Lowers the numerical precision in the model to shrink its size and speed up computation.

Analogy: Instead of storing every point's GPS coordinate on a map at 10 decimal places, you store it at 2. You lose a bit of detail, but the map is way smaller and loads way faster.

The technical bit:

FP32 → FP16: Dropping from 32-bit floating point to 16-bit (model size halves)

FP32 → INT8: Dropping from 32-bit to 8-bit integer (model shrinks 4x)

FP32 → INT4: More aggressive shrinking, with a bit more accuracy loss

Why it matters in security:

Running AI-based security on low-power devices (IoT, mobile)

Cutting latency for real-time threat detection

Deploying big models more cost-effectively

🍕 Real-World Example:An IoT security company wants to run a deep learning model for traffic analysis on an edge gateway. The original model is 2 GB and only runs on a GPU server. With pruning + quantization (INT8), they get it down to 200 MB and running in real time on an ARM-based gateway. The accuracy hit is just 2%, an acceptable trade-off.

This is the art and science of working with AI models, LLMs especially. Writing the right prompt is the key to getting the right, useful output back. As a security pro, you've got to know prompt engineering to use AI tools effectively.

What it does: Defines the model's overall behavior, personality, and constraints. The user usually never sees it, it runs in the background.

Analogy: Like the job description and rules you give an employee before they start. "You're a customer service rep. Always be polite. Never share pricing. Route technical questions to engineering."

Example:

You are a cybersecurity expert. When informing users about
vulnerabilities, do not share specific exploit code that could be
actively abused. Always lead with defense-focused recommendations.

Security angle:

System prompts set the model's security boundaries

A poorly written system prompt can be wide open to prompt injection

Putting sensitive info in the system prompt is a risk in itself

What it does: The actual question or request the user sends to the model. It gets processed within the rules the system prompt laid down.

Example:

Analyze this log entry and identify any potential security threats:
[2024-01-15 03:22:11] Failed login attempt from IP 185.234.xx.xx - User: admin
[2024-01-15 03:22:13] Failed login attempt from IP 185.234.xx.xx - User: admin
[2024-01-15 03:22:14] Failed login attempt from IP 185.234.xx.xx - User: root
... (500 more lines)

What makes a good user prompt:

Clear and specific

Carries context (what you want, in what format)

Provides the data the model needs

What it does: You give the model a task with zero examples, just the instruction.

Analogy: Telling someone "analyze whether this email is phishing"; no examples up front, just the task.

Example:

Analyze the email below and determine whether it's phishing:

"Dear user, we've detected a suspicious login on your account.
To verify your account, please click the link below:
http://secure-bank-verify.suspicious-domain.com/login"

When to use it: Simple, well-defined tasks where the model's general knowledge is enough.

Upside: Fast, easy. Downside: Accuracy can drop on complex or specialized tasks.

What it does: You give the model a single example and ask it to do the same kind of task.

Analogy: You show someone once: "Look, this phishing email is malicious for these reasons." Then you hand them another and say, "Now analyze this one the same way."

Example:

Do a security log analysis like the example below:

Example:
Log: "Failed SSH login from 10.0.0.5 to 10.0.0.1 (user: root) x 50 in 2 min"
Analysis: Brute force attack. Block the source IP. Disable SSH root login.
Install Fail2ban. Add MFA.

Now analyze this:
Log: "Outbound DNS queries to 185.xx.xx.xx:53 with TXT records averaging 500 bytes
every 30 seconds from host WORKSTATION-42"

When to use it: When you want the model's output in a particular format or approach.

What it does: You give the model several examples so it learns the pattern better.

Analogy: Like showing a new intern not one case but five different ones, "see the common thread in all of them? Now you try."

Example:

Classify the following security events by severity:

Example 1:
Event: "User entered wrong password (once)"
Severity: LOW

Example 2:
Event: "500 failed login attempts from the same IP in 5 minutes"
Severity: HIGH

Example 3:
Event: "VPN connection on the admin account outside business hours"
Severity: MEDIUM

Now classify:
Event: "10 GB data transfer from the database server to an external IP at 2 AM"

When to use it: Complex classification tasks, when you need consistent formatting, or when the model has to learn a specific decision logic.

Upside: Highest accuracy and consistency. Downside: Longer prompt, more token usage (= more cost).

What it does: You assign the model a specific area of expertise or persona. This shapes the tone, depth, and focus of its answers.

Example:

You are a seasoned SOC Tier 3 analyst with 15 years of incident
response experience. You know the MITRE ATT&CK framework cold.
In every response, you always:
1. First, identify the threat's MITRE ATT&CK tactic and technique
2. Then assess the impact
3. Finally, lay out the containment and remediation steps

In security:

Assigning AI chatbots a security-expert role

Different roles for different scenarios: pentester, SOC analyst, compliance officer

Models with no role assigned tend to give shallower, more generic answers

What it does: Pre-defined, reusable structures for prompts. Patterns with variables in them.

Analogy: Like an incident response report template, instead of writing it from scratch every time, you fill in the blanks.

Example template:

### Security Incident Analysis Template

**Incident Type:** {incident_type}
**Source IP:** {source_ip}
**Target System:** {target_system}
**Time:** {timestamp}
**Log Data:** {log_data}

Using the information above, please:
1. Assess the severity (LOW/MEDIUM/HIGH/CRITICAL)
2. Identify the likely attack vector
3. List the immediate actions to take
4. Offer long-term remediation recommendations

In security:

Getting SOC teams consistent analysis out of their AI tools

Standardized AI queries during incident response

Building repeatable, auditable prompt structures

Reducing prompt injection risk — user input gets placed in a controlled slot inside the template

🍕 Real-World Example:An MSSP (Managed Security Service Provider) sets up a template system to handle the hundreds of security incidents coming in from clients. Analysts just paste the log data into the template — the AI produces analysis and recommendations in a consistent, standard format. Now the output from 10 different analysts lines up, and SLA times drop by 60%.

AI Type	What it does	Security example
Generative AI	Creates new content	Phishing simulation, report generation
Machine Learning	Learns patterns from data, makes predictions	Spam filtering, malware detection
Statistical Learning	Analyzes with statistical methods	Anomaly detection, risk scoring
Transformers	Processes contextual relationships in parallel	Advanced text analysis, log analysis
Deep Learning	Learns complex patterns via multi-layer networks	Zero-day detection, deepfake detection
NLP	Processes and understands human language	Phishing detection, threat intel analysis
LLM	Large-scale language model	SOC assistant, policy generation
SLM	Small, focused language model	Edge security, on-prem analysis
GAN	Generates realistic synthetic data	Deepfakes, synthetic training data

Technique	Data type	When to use
Supervised Learning	Labeled	Classifying known threats
Unsupervised Learning	Unlabeled	Discovering unknown threats
Reinforcement Learning	Reward/penalty signal	Adaptive defense systems
Fine-tuning	Domain-specific data	Specializing a general model

Technique	Example count	Accuracy	Cost
Zero-shot	0	Low–Medium	Lowest
One-shot	1	Medium	Low
Multi-shot	2+	High	High

💡 Tip 1:When you see "compare and contrast" on the exam, they expect you to know both the differencesandthe similarities. Know supervised vs. unsupervised, LLM vs. SLM, and zero-shot vs. multi-shot especially well.

💡 Tip 2:Don't mix up pruning and quantization. Pruning cuts out unnecessary connections (a structural change); quantization lowers numerical precision (a precision change). Both shrink the model, but in different ways.

💡 Tip 3:Prompt engineering questions may ask you to pick the best technique for a scenario. "You've got no examples but you need a fast result" → zero-shot. "You want consistent, specific output and you've got a few examples" → multi-shot.

💡 Tip 4:Remember that GANs show up on both offense and defense. Don't pigeonhole the GAN as just an attack tool on the exam — defensive uses like generating synthetic training data are just as critical.

💡 Tip 5:Know the difference between a system prompt and a user prompt cold. The system prompt defines the model's "identity" and is usually invisible to the user. The user prompt is the user's actual request. Security-wise, hijacking the system prompt (prompt injection) is a serious risk.

The concept that fine-tuning is built on. Knowledge learned for one task gets carried over to another. For example, taking a general text-understanding ability and turning it into a security-log-analysis ability. When a fine-tuning question comes up on the exam, it helps to know the transfer learning behind it.

The process of turning text, words, or other data into numerical vectors. The words "phishing" and "credential theft" end up close together in vector space. In security, embeddings are used to find similar attack patterns. We'll dig into this more in Section 1.2, but it's worth grasping the basic idea here too.

The heart of the Transformer architecture. It lets the model pay different levels of "attention" to different parts of the input. It's the mechanism that answers "which piece of info in this log line is critical?" Even if it isn't asked directly on the exam, you need it to actually understand Transformers.

Overfitting: The model memorized the training data and falls apart on new data. Too many epochs, too complex a model.

Underfitting: The model didn't learn enough and fails on both training and test data. Too few epochs, too simple a model.

These show up in model validation and fine-tuning questions.

Drilling the material: Reading is one thing, recall is another. I built BREACH // PROTOCOL, a roguelite-style question app (spaced repetition, active recall, exam sim mode) to actually drill this stuff, it's free and open source. → https://github.com/Furkan-Taskin/breach-protocol

More sections dropping in this series. Follow if you're on the same grind.

source & further reading

dev.to — original article your CI agent is reading more than your prompt I automated everything except the code, and that's where Claude Code actually paid off One command for 13 AI coding-assistant context files

Section 1.1 — Comparing AI Types and Techniques Used in Cybersecurity

Run your AI side-project on zahid.host