Security Checks with Local LLMs

This article describes an experiment in using local Large Language Models (LLMs) for security checks and code quality reviews, motivated by rising costs and new limits for cloud-based LLM APIs. The author selected the `qwen2.5-coder:14b-instruct-q5_K_M` model via Ollama, running on a MacBook Air M5 with 24GB RAM, and created a custom bash script to automate file scanning with configurable prompts and cooldown delays. The article concludes that a 32k context window provides a good balance between execution speed and hardware temperature, with plans for further experimentation.

Continuing the articles "AI-Powered Repository Security Check with Antigravity Workflow" (https://dev.to/gdg/ai-powered-repository-security-check-with-antigravity-workflow-5hee) and https://dev.to/gdg/how-to-build-a-custom-ai-quality-gate-on-cloud-run-from-zero-to-production-1odp, I decided to try outsourcing some checks to a local LLM. This article describes my experiment and its outcomes. I will be glad to read your questions, proposals, opinions, or advice 🙌

You can listen to a podcast generated from this publication with NotebookLM.

Intro

Recent changes in limit management for popular LLM APIs got me thinking about FinOps. Why should I spend expensive cloud tokens on simple tasks? I also had many conversations at recent security and AI events, which led me to start experimenting with local LLMs for code generation and code quality checks.

Hardware

The hardware for the experiments is a MacBook Air M5 with 24GB RAM. I bought it specifically for diving into ML topics, but it has been underused until now.

Pains

The first pain was the introduction of new limits for the Antigravity IDE. Together with changes to the list of available models, it pushed me to optimize my development and security flows, which were designed to use cheaper Antigravity tokens before falling back to more expensive Vertex AI tokens.

The second pain was FOMO about Machine Learning and MLOps itself.

Solution Track

After some iterations with Ollama and local models, I selected `qwen2.5-coder:14b-instruct-q5_K_M` as the base model, with an optimized context window:

```
% cat Modelfile-qwen-32k
FROM qwen2.5-coder:14b-instruct-q5_K_M
PARAMETER num_ctx 32000

% ollama create qwen-coder-32k -f ./Modelfile-qwen-32k
...
```
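Since comparing context windows means creating one model variant per size, the Modelfile step can be scripted. The following is only a minimal sketch: the `write_modelfile` helper, the `CREATE_MODELS` switch, and the `Modelfile-qwen-<N>k` naming are hypothetical, and mapping "64k" to `num_ctx 64000` is an assumption extrapolated from the 32k file shown above.

```bash
#!/bin/bash
# Base model taken from the article; everything else here is illustrative.
BASE_MODEL="qwen2.5-coder:14b-instruct-q5_K_M"

# Write a Modelfile for a context window given in thousands of tokens
# and print the path of the file that was written.
# Assumption: "Nk" means N * 1000 tokens, as in the 32k example (32000).
write_modelfile() {
    local ctx_k="$1"
    local out_dir="${2:-.}"
    local path="$out_dir/Modelfile-qwen-${ctx_k}k"
    printf 'FROM %s\nPARAMETER num_ctx %d\n' "$BASE_MODEL" "$((ctx_k * 1000))" > "$path"
    echo "$path"
}

# Create the Ollama models only when explicitly requested (hypothetical switch)
# and only if the ollama CLI is available.
if [ "${CREATE_MODELS:-0}" = "1" ] && command -v ollama > /dev/null 2>&1; then
    for k in 32 64; do
        ollama create "qwen-coder-${k}k" -f "$(write_modelfile "$k")"
    done
fi
```

Running it with `CREATE_MODELS=1` would produce the `qwen-coder-32k` and `qwen-coder-64k` variants; without the switch it only defines the helper, so nothing is created by accident.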
```
% ollama list
NAME                                 ID              SIZE      MODIFIED
qwen-coder-32k:latest                dc3c4762d967    10 GB     2 hours ago
qwen-coder-64k:latest                42f060e717dd    10 GB     2 hours ago
qwen2.5-coder:14b-instruct-q5_K_M    05d16c5ac1c1    10 GB     2 hours ago
gemma4:e4b                           c6eb396dbd59    9.6 GB    25 hours ago
gemma4:e2b                           7fbdbf8f5e45    7.2 GB    25 hours ago
```

The 32k window gave me reasonably quick execution and an acceptable trade-off between speed and the temperature of my laptop. I expect to keep experimenting with this configuration in the near future.

Then I realized that I have to decompose tasks and give my hardware some rest time between requests. So the unified script was born:

```bash
#!/bin/bash

# Default values
OUTPUT_DIR="."
MODEL_NAME="qwen-coder-32k"
COEFF=2
PROMPT_FILE=""

show_help() {
    echo "Usage: $0 -d <directory> -m <file_mask> -p <prompt_file> [OPTIONS]"
    echo ""
    echo "Required parameters:"
    echo "  -d   Directory for searching files"
    echo "  -m   File mask to check"
    echo "  -p   Path to a text file with system prompt (e.g., prompts/strict_table.txt)"
    echo ""
    echo "Optional parameters:"
    echo "  -o   Directory to save the final report (default: current directory)"
    echo "  -e   Exclude directories (comma-separated, e.g., venv,tests,migration)"
    echo "  -f   Exclude file masks (comma-separated, e.g., test_*,__init__.py)"
    echo "  -c   Cooldown delay multiplier (default: 2)"
    exit 1
}

# Argument parsing
while getopts "d:m:o:e:f:c:p:h" opt; do
    case "$opt" in
        d) SRC_DIR="$OPTARG" ;;
        m) FILE_MASK="$OPTARG" ;;
        o) OUTPUT_DIR="$OPTARG" ;;
        e) EXCLUDE_DIRS="$OPTARG" ;;
        f) EXCLUDE_FILES="$OPTARG" ;;
        c) COEFF="$OPTARG" ;;
        p) PROMPT_FILE="$OPTARG" ;;
        h) show_help ;;
        *) show_help ;;
    esac
done

# Check required parameters
if [ -z "$SRC_DIR" ] || [ -z "$FILE_MASK" ] || [ -z "$PROMPT_FILE" ]; then
    echo "❌ Error: Required parameters (-d, -m, or -p) are missing."
    show_help
fi

# Check if prompt file exists
if [ ! -f "$PROMPT_FILE" ]; then
    echo "❌ Error: Prompt file '$PROMPT_FILE' not found!"
    exit 1
fi

# Check Ollama: fail only if neither the process nor the local API is found
if ! pgrep -x "ollama" > /dev/null && ! curl -s http://localhost:11434 > /dev/null; then
    echo "❌ Error: Ollama is not running!"
    exit 1
fi

# Check jq
if ! command -v jq &> /dev/null; then
    echo "❌ Error: 'jq' utility is not installed. Run: brew install jq"
    exit 1
fi

# Initialize report directory
mkdir -p "$OUTPUT_DIR"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
REPORT_FILE="$OUTPUT_DIR/review_report_$TIMESTAMP.md"

# Write report header
{
    echo "# 🛡️ Review Report"
    echo "Generation date: $(date)"
    echo "Used prompt: \`$PROMPT_FILE\`"
    echo -e "\n---\n"
} > "$REPORT_FILE"

echo "=================================================================="
echo "🕵️‍♂️ Starting review..."
echo "📂 Final report will be saved to: $REPORT_FILE"
echo "=================================================================="

# Build find command
FIND_CMD="find \"$SRC_DIR\" -type f -name \"$FILE_MASK\""

if [ -n "$EXCLUDE_DIRS" ]; then
    IFS=',' read -ra DIRS <<< "$EXCLUDE_DIRS"
    FOR_FIND=""
    for dir in "${DIRS[@]}"; do
        if [ -z "$FOR_FIND" ]; then
            FOR_FIND="-path '*/$dir/*'"
        else
            FOR_FIND="$FOR_FIND -o -path '*/$dir/*'"
        fi
    done
    # Prune excluded directories, then match files by mask
    FIND_CMD="find \"$SRC_DIR\" \( $FOR_FIND \) -prune -o -type f -name \"$FILE_MASK\" -print"
fi

# Start main file processing loop
eval "$FIND_CMD" | while IFS= read -r file; do
    if [ ! -f "$file" ]; then continue; fi

    # Check file exclusions
    if [ -n "$EXCLUDE_FILES" ]; then
        IFS=',' read -ra FILE_MASKS <<< "$EXCLUDE_FILES"
        skip_file=false
        for mask in "${FILE_MASKS[@]}"; do
            # Unquoted $mask is intentional: it is matched as a glob pattern
            if [[ "$(basename "$file")" == $mask ]]; then
                skip_file=true
                break
            fi
        done
        if [ "$skip_file" = true ]; then
            echo "⏭️ Skipping file (excluded by mask): $file"
            continue
        fi
    fi

    echo -n "⏳ Analyzing: $file ... "

    # Read code and clear comments/empty lines
    CLEANED_CODE=$(sed -e 's/[[:space:]]*#.*//' -e '/^[[:space:]]*$/d' "$file")

    if [ -z "$CLEANED_CODE" ]; then
        echo "⚠️ Empty."
        continue
    fi

    # Write file section to report
    {
        echo "## 📁 File: $file"
        echo -e "\n### 🔍 Analysis results:\n"
    } >> "$REPORT_FILE"

    # Read external prompt and combine with code.
    # printf turns \n into real newlines; a plain double-quoted string
    # would send a literal backslash-n to the model.
    SYSTEM_PROMPT=$(cat "$PROMPT_FILE")
    FULL_PROMPT=$(printf '%s\n\n--- TARGET CODE ---\n%s' "$SYSTEM_PROMPT" "$CLEANED_CODE")

    JSON_PAYLOAD=$(jq -n --arg model "$MODEL_NAME" --arg prompt "$FULL_PROMPT" \
        '{model: $model, prompt: $prompt, stream: false}')

    # Measure time and send API request
    START_TIME=$(date +%s)

    curl -s -X POST http://localhost:11434/api/generate \
        -H "Content-Type: application/json" \
        -d "$JSON_PAYLOAD" | jq -r '.response' >> "$REPORT_FILE"

    END_TIME=$(date +%s)
    ELAPSED=$((END_TIME - START_TIME))
    SLEEP_TIME=$((ELAPSED * COEFF))

    echo -e "\n\n---\n\n" >> "$REPORT_FILE"
    echo "✅ Elapsed: ${ELAPSED}s. Rest: ${SLEEP_TIME}s."

    if [ "$SLEEP_TIME" -gt 0 ]; then
        sleep "$SLEEP_TIME"
    fi
done

echo "=================================================================="
echo "🎉 Review successfully completed!"
echo "=================================================================="
```

The logic of the script:

- Get info about which files to check and where they are stored.
- Get the file with the prompt content.
- Get some optional parameters for filtering, outputs, and delays between requests.
- For each file:
  - Read the file and strip content that carries no meaning for the review, such as comments and empty lines.
  - Send the file content to the local LLM along with the prompt.
  - Receive the result and save it to the report.
  - Measure the processing time for the file and sleep for a multiple of it (x2 by default) to cool down the hardware.

Outcomes

Execution Flow

```
(venv) %n@%m %1~ % ./scripts/repo-check-1.sh -d scripts -m "setup*" -p scripts/prompt-infrasec.txt
==================================================================
🕵️‍♂️ Starting review...
📂 Final report will be saved to: ./review_report_20260521_121530.md
==================================================================
⏳ Analyzing: scripts/setup-quality-gate-iam.sh ... ✅ Elapsed: 6s. Rest: 12s.
⏳ Analyzing: scripts/setup-gcp-details.sh ... ✅ Elapsed: 95s. Rest: 190s.
⏳ Analyzing: scripts/setup-gcp.sh ... ✅ Elapsed: 128s. Rest: 256s.
==================================================================
🎉 Review successfully completed!
==================================================================
```

Report

🔍 Analysis results:

| Finding / Vulnerability | Recommendation / Fix |
|---|---|
| Assigning public access `legacyObjectReader` to GCS bucket | Remove the line `gsutil iam ch allUsers:legacyObjectReader "gs://${BUCKET_NAME}"` to prevent making the bucket publicly accessible. Consider using more restrictive permissions based on your security requirements. |
| Hardcoded service account name in the script | Avoid hardcoding sensitive information like service account names. Instead, retrieve them from a secure source or use environment variables. |
| Missing encryption settings for GCS bucket | Ensure that the GCS bucket is encrypted by default. Add the `--encryption` flag to the `gsutil mb` command if you want to specify a specific encryption type, such as `--encryption=DEFAULT`. |
| No logging and monitoring configurations | Implement logging and monitoring for the resources created. Enable Cloud Logging and Monitoring to track access and usage of the secrets and GCS bucket. |
| Using automatic replication policy for secrets | Consider using a more controlled replication policy for secrets. Automatic replication might not be necessary for all use cases, and you should evaluate whether it aligns with your security and compliance requirements. |
| Lack of error handling for secret creation | Add proper error handling when creating the secret to ensure that any issues during the creation process are caught and addressed appropriately. |
| No version control for secrets | Ensure that secrets have a versioning strategy in place. This allows you to manage changes and roll back to previous versions if needed. |
| Potential for misconfiguration of IAM roles | Double-check the IAM roles being assigned to ensure they align with the principle of least privilege. Avoid assigning broader permissions than necessary for the service account. |

Conclusion

The results look extremely interesting:

- The elapsed time is quite good for me.
- The LLM's answer is quite similar to what cloud LLMs produce.

And this was achieved without prompt tuning or additional context manipulation.

Further steps planned:

- Experiment with models, context windows, prompts, and additional context.
- Check whether it will work on some kind of local SOHO server for batch tasks.
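When planning batch runs, it may help to estimate the total wall-clock time including cooldown. Each file costs its processing time plus that time multiplied by the cooldown coefficient, so the total is the sum of `elapsed * (1 + coeff)`. The sketch below is an illustration only; `estimate_total` is a hypothetical helper and not part of the script above.

```bash
#!/bin/bash
# Estimate total run time in seconds: for every file, the elapsed time
# plus the cooldown (elapsed * coefficient), as the review script sleeps.
# Usage: estimate_total <coeff> <elapsed_1> [<elapsed_2> ...]
estimate_total() {
    local coeff="$1"
    shift
    local total=0
    local elapsed
    for elapsed in "$@"; do
        total=$((total + elapsed + elapsed * coeff))
    done
    echo "$total"
}

# Timings from the run shown earlier (6s, 95s, 128s) with the default x2 cooldown:
estimate_total 2 6 95 128   # prints 687
```

For the three files from the run in this article, that is 687 seconds in total, of which 229 seconds are actual inference and 458 seconds are rest time. Lowering the coefficient with `-c` is the most direct way to shorten a batch, at the cost of a warmer laptop.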